For: PR0269 POLYPEPTIDES

Customer No. 35489

ON APPEAL TO THE BOARD OF PATENT APPEALS AND INTERFERENCES 

APPELLANTS' BRIEF 

MAIL STOP APPEAL BRIEF - PATENTS 

Commissioner for Patents 
P.O. Box 1450 
Alexandria, Virginia 22313-1450 

Dear Sir: 

On January 13, 2005, the Examiner issued a final rejection of pending Claims 44-46 and 
49-52. A Notice of Appeal was filed on June 13, 2005. 

Appellants hereby appeal to the Board of Patent Appeals and Interferences from the last 
decision of the Examiner. A request for a 1 month extension of time is filed concurrently 
herewith. 

The following constitutes Appellants' Brief on Appeal. 

1. REAL PARTY IN INTEREST 

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the patent application U.S. Serial No. 09/665,350 recorded July 9, 2001, at Reel 
011964 and Frame 0181. 

2. RELATED APPEALS AND INTERFERENCES 

The claims pending in the current application are directed to an antibody which binds to a 
polypeptide referred to herein as "PR0269". There exist two related patent applications, (1) 
U.S. Serial No. 09/902,713, filed July 10, 2001 (containing claims directed to antibodies to 
PR0269 polypeptides), and (2) U.S. Serial No. 09/907,841, filed July 17, 2001 (containing 
claims directed to polynucleotides encoding PR0269 polypeptides). U.S. Serial No. 09/907,841 
has been allowed. U.S. Serial No. 09/904,766 is also under final rejection from the same 
Examiner and based upon the same outstanding rejection, and appeal of this final rejection is 
being pursued independently and concurrently herewith. 

3. STATUS OF CLAIMS 

Claims 44-46 and 49-52 are in this application. 
Claims 1-43 and 47-48 are canceled. 

Claims 44-46 and 49-52 stand rejected and Appellants appeal the rejection of these 
claims. 

A copy of the rejected claims involved in the present Appeal is provided as Appendix A. 

4. STATUS OF AMENDMENTS 

There were no amendments to the claims submitted after final rejection. All previous 
amendments to the claims have been entered. 

5. SUMMARY OF THE INVENTION 

The invention claimed in the present application is related to an isolated polypeptide 
comprising (a) the amino acid sequence of the polypeptide of SEQ ID NO:96; (b) the amino acid 
sequence of the polypeptide of SEQ ID NO:96, lacking its associated signal peptide; (c) the 
amino acid sequence of the extracellular domain of the polypeptide of SEQ ID NO:96; or (d) the 
amino acid sequence of the polypeptide encoded by the full-length coding sequence of the cDNA 
deposited under ATCC accession number 209397 (Claims 44-46, 49 and 52). The claims are 
further directed to a chimeric polypeptide comprising a polypeptide according to Claim 44 fused 
to a heterologous polypeptide (Claim 50). The claims are further directed to a chimeric 
polypeptide according to Claim 50 wherein the heterologous polypeptide is an epitope tag or an 
Fc region of an immunoglobulin (Claim 51). 

The full-length PR0269 polypeptide having the amino acid sequence of SEQ ID NO:96 
is described in the specification at, for example, page 12, line 30 to page 13, line 1, page 40, lines 
1-11, page 103, lines 4-12, in Figure 36 and in SEQ ID NO:96. PR0269 is described as a novel 
polypeptide having a signal peptide sequence and a transmembrane domain (see, for example, 
Example 15 and Figure 36). As shown in Example 92 and Table 9 of the specification, PR0269 
showed approximately 2- to 3.5-fold amplification in 8 primary lung tumors and tumor cell lines 
(see Table 9). The cDNA nucleic acid encoding PR0269 is described in the specification 
at, for example, Example 15, in Figure 35 and in SEQ ID NO:95. Page 60, lines 18-22 of the 
specification provides the description for Figures 35 and 36. The preparation of chimeric PRO 
polypeptides, including those wherein the heterologous polypeptide is an epitope tag or an Fc 
region of an immunoglobulin, is set forth in the specification at page 116, lines 12-35. Examples 
53-56 describe the expression of PRO polypeptides in various host cells, including E. coli, 
mammalian cells, yeast and Baculovirus-infected insect cells. 

6. ISSUES BEFORE THE BOARD 

I. Whether Claims 44-46 and 49-52 satisfy the utility requirement of 35 USC §101. 
II. Whether Claims 44-46 and 49-52 satisfy the enablement requirement of 35 USC 
§112, first paragraph. 

7. GROUPING OF CLAIMS 

With respect to Issue I, all claims (Claims 44-46 and 49-52) stand and fall together. 
With respect to Issue II, all claims (Claims 44-46 and 49-52) stand and fall together. 



8. ARGUMENTS 
Summary of the Arguments: 

Issue I: Utility 

Claims 44-46 and 49-52 stand rejected under 35 U.S.C. §101 as allegedly lacking either 
a specific and substantial asserted utility or a well established utility. Appellants have previously 
explained that patentable utility of the PR0269 polypeptides is based upon the gene 
amplification data for the gene encoding the PR0269 polypeptide. The specification discloses 
that the gene encoding PR0269 showed significant amplification, ranging from 2- to 3.5-fold, in 8 
different lung primary tumors and tumor cell lines. Appellants have also submitted, with their 
Response filed February 21, 2003, the Declaration of Dr. Audrey Goddard, which explains that a 
gene identified as being amplified at least 2-fold by the disclosed gene amplification assay in a 
tumor sample relative to a normal sample is useful as a marker for the diagnosis of cancer, for 
monitoring cancer development and/or for measuring the efficacy of cancer therapy. 
Accordingly, based on the Goddard Declaration and the teachings in the specification, one of 
ordinary skill would find it credible that the claimed PR0269 polypeptides have utility for the 
diagnosis of lung tumors. 

In response to this evidence, the Examiner has asserted that "the specification provides 
data showing a very small increase in DNA copy number, approximately 2 fold, in a few tumor 
samples for PR0269. There is no evidence regarding whether or not the PR0269 mRNA or 
polypeptide levels are also increased in these tumor samples. Since the instant claims are 
directed to PR0269 polypeptide, it was imperative to find evidence in the relevant scientific 
literature whether or not a small increase in DNA copy number would be considered by the 
skilled artisan to be predictive of increased in mRNA and polypeptide levels" (Page 6 of the 
Office Action mailed January 13, 2005). In support of this assertion, the Examiner has cited 
references by Pennica et al. and Konopka et al. as "evidence showing lack for correlation 
between gene amplification and increased polypeptide levels." (Page 6 of the Office Action 
mailed January 13, 2005). The Examiner has cited Haynes et al. as evidence that polypeptide 
levels cannot be accurately predicted from mRNA levels. (Page 6 of the Office Action mailed 
January 13, 2005). 

Appellants submit that the Examiner applied an improper legal standard when making 
this rejection. The evidentiary standard to be used throughout ex parte examination in setting 
forth a rejection is a preponderance of the totality of the evidence under consideration. Thus, to 
overcome the presumption of truth that an assertion of utility by the applicant enjoys, the 
Examiner must establish that it is more likely than not that one of ordinary skill in the art would 
doubt the truth of the statement of utility. Only after the Examiner has made a proper prima facie 
showing of lack of utility, does the burden of rebuttal shift to the applicant. 

The references cited by the Examiner do not suffice to make a prima facie case that it is 
more likely than not that no generalized correlation exists between gene (DNA) amplification 
and increased polypeptide levels. In particular, the combined teachings of Pennica et al and 
Konopka et al are not directed towards genes in general but to a single gene or genes within a 
single family and thus, their teachings cannot support a general conclusion regarding correlation 
between gene amplification and mRNA or protein levels. Haynes teaches that there is a general 
trend but no strong correlation between protein and transcript levels in yeast. 

In contrast, Appellants have submitted ample evidence to show that, in general, if a gene 
is amplified in cancer, it is more likely than not that the encoded protein will be expressed at an 
elevated level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of 
record in Appellants' Response filed November 3, 2004) collectively teach that in general gene 
amplification increases mRNA expression. Second, the Declaration of Dr. Paul Polakis (made of 
record in Appellants' Response filed November 3, 2004), principal investigator of the Tumor 
Antigen Project of Genentech, Inc., the assignee of the present application, shows that, in 
general, there is a correlation between mRNA levels and polypeptide levels. Clearly, the 
research community believes that the information obtained from these chips is useful (i.e., that it 
is more likely than not informative of the protein level). 

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology that there is a correlation between DNA, mRNA, 
and polypeptide levels, these instances are exceptions rather than the rule. In the majority of 
amplified genes, as exemplified by Orntoft et al., Hyman et al., Pollack et al., and the Polakis 
Declaration, the teachings in the art overwhelmingly show that gene amplification influences 
gene expression at the mRNA and protein levels. Therefore, one of skill in the art would 
reasonably expect in this instance, based on the amplification data for the PR0269 gene, that the 
PR0269 polypeptide is concomitantly overexpressed. Thus, the claimed PR0269 polypeptides 
have utility in the diagnosis of cancer. 

Appellants further submit that even if there is no correlation between gene amplification 
and increased mRNA/protein expression (which Appellants expressly do not concede), a 
polypeptide encoded by a gene that is amplified in cancer would still have a specific, substantial, 
and credible utility. Appellants submit that, as evidenced by the Ashkenazi Declaration (made of 
record in Appellants' Response filed May 21, 2004), simultaneous testing of gene amplification 
and gene product over-expression enables more accurate tumor classification, even if the gene 
product, the protein, is not over-expressed. This leads to better determination of a suitable 
therapy for the tumor. 

Accordingly, Appellants submit that when the proper legal standard is applied, one 
should reach the conclusion that the present application discloses at least one patentable utility 
for the claimed PR0269 polypeptides. 

Issue II: Enablement 

Claims 44-46 and 49-52 stand rejected under 35 U.S.C. §112, first paragraph, allegedly 
"since the claimed invention is not supported by either a specific and substantial asserted utility or a 
well established utility for the reasons set forth above, one skilled in the art clearly would not know 
how to use the claimed invention." (Page 2 of the Office Action mailed January 13, 2005). 

Appellants submit that, as discussed above, the PR0269 polypeptides have utility in the 
diagnosis of cancer. Based on such a utility, one of skill in the art would know exactly how to 
use the claimed polypeptides for diagnosis of cancer, without any undue experimentation. 

These arguments are all discussed in further detail below under the appropriate headings. 



Appeal Brief 
Application Serial No. 09/904,766 
Attorney's Docket No. 39780-1618P2C33 



ISSUE I: Claims 44-46 and 49-52 satisfy the utility requirement of 35 USC §101 

Claims 44-46 and 49-52 stand rejected under 35 U.S.C. §101 because allegedly "the 
claimed invention is not supported by either a credible, specific and substantial asserted utility or a 
well established utility." (Page 3 of the Office Action mailed January 13, 2005). 

The Examiner has asserted that "the specification provides data showing a very small 
increase in DNA copy number, approximately 2 fold, in a few tumor samples for PR0269. 
There is no evidence regarding whether or not the PR0269 mRNA or polypeptide levels are also 
increased in these tumor samples. Since the instant claims are directed to the PR0269 
polypeptide, it was imperative to find evidence in the relevant scientific literature whether or not 
a small increase in DNA copy number would be considered by the skilled artisan to be predictive 
of increased in mRNA and polypeptide levels" (Page 6 of the Office Action mailed January 13, 
2005). In support of this assertion, the Examiner has cited references by Pennica et al., Konopka 
et al., and Haynes et al. as "evidence of lack for correlation between gene amplification and 
increased polypeptide levels." (Page 6 of the Office Action mailed January 13, 2005). 

Appellants submit, for the reasons set forth below, that the specification discloses at least 
one credible, substantial and specific asserted utility for the claimed PR0269 polypeptides. 

A. The Legal Standard for Utility 

According to 35 U.S.C. § 101: 

Whoever invents or discovers any new and useful process, machine, manufacture, or 
composition of matter, or any new and useful improvement thereof, may obtain a 
patent therefor, subject to the conditions and requirements of this title. (Emphasis 
added.) 

In interpreting the utility requirement, in Brenner v. Manson, 1 the Supreme Court held that 
the quid pro quo contemplated by the U.S. Constitution between the public interest and the 
interest of the inventors required that a patent applicant disclose a "substantial utility" for his or 
her invention, i.e. a utility "where specific benefit exists in currently available form." 2 The Court 
concluded that "a patent is not a hunting license. It is not a reward for the search, but 
compensation for its successful conclusion. A patent system must be related to the world of 
commerce rather than the realm of philosophy." 3 

1 Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966). 

2 Id. at 534, 148 U.S.P.Q. (BNA) at 695. 

Later, in Nelson v. Bowler 4 the C.C.P.A. acknowledged that tests evidencing 
pharmacological activity of a compound may establish practical utility, even though they may not 
establish a specific therapeutic use. The court held that "since it is crucial to provide researchers 
with an incentive to disclose pharmaceutical activities in as many compounds as possible, we 
conclude adequate proof of any such activity constitutes a showing of practical utility." 5 

In Cross v. Iizuka 6 the C.A.F.C. reaffirmed Nelson, and added that in vitro results might 
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively 
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro 
results with the particular pharmacological activity are generally predictive of in vivo test results, 
i.e. there is a reasonable correlation therebetween." 7 The court perceived "No insurmountable 
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a 
practical utility." 8 

The case law has also clearly established that Applicants' statements of utility are usually 
sufficient, unless such statement of utility is unbelievable on its face. 9 The PTO has the initial 
burden to prove that Applicants' claims of usefulness are not believable on their face. 10 In 
general, an Applicant's assertion of utility creates a presumption of utility that will be sufficient 
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in the 
art to question the objective truth of the statement of utility or its scope." 11, 12 

3 Id. at 536, 148 U.S.P.Q. (BNA) at 696. 

4 Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980). 

5 Id. at 856, 206 U.S.P.Q. (BNA) at 883. 

6 Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985). 

7 Id. at 1050, 224 U.S.P.Q. (BNA) at 747. 

8 Id. 

9 In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967). 

10 Ibid. 

Compliance with 35 U.S.C. §101 is a question of fact. 13 The evidentiary standard to be 
used throughout ex parte examination in setting forth a rejection is a preponderance of the 
totality of the evidence under consideration. 14 Thus, to overcome the presumption of truth that 
an assertion of utility by the applicant enjoys, the Examiner must establish that it is more likely 
than not that one of ordinary skill in the art would doubt the truth of the statement of utility. 
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden of 
rebuttal shift to the applicant. The issue will then be decided on the totality of evidence. 

The well established case law is clearly reflected in the Utility Examination Guidelines 
("Utility Guidelines") 15 , which acknowledge that an invention complies with the utility 
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible 
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when it 
is particular to the subject matter claimed. For example, it is generally not enough to state that a 
nucleic acid is useful as a diagnostic without also identifying the conditions that are to be 
diagnosed. 

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however, 
that Office personnel must be careful not to interpret the phrase "immediate benefit to the public" 
or similar formulations used in certain court decisions to mean that products or services based on 
the claimed invention must be "currently available" to the public in order to satisfy the utility 
requirement. "Rather, any reasonable use that an applicant has identified for the invention that 
can be viewed as providing a public benefit should be accepted as sufficient, at least with regard 
to defining a 'substantial' utility." 16 Indeed, the Guidelines for Examination of Applications for 
Compliance With the Utility Requirement 17 give the following instruction to patent examiners: 
"If the applicant has asserted that the claimed invention is useful for any particular practical 
purpose ... and the assertion would be considered credible by a person of ordinary skill in the 
art, do not impose a rejection based on lack of utility." 

11 In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974). 

12 See also In re Jolles, 628 F.2d 1322, 206 USPQ 885 (C.C.P.A. 1980); In re Irons, 340 F.2d 974, 144 
USPQ 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 USPQ 209, 212-13 (C.C.P.A. 1977). 

13 Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983) cert. denied, 469 
US 835 (1984). 

14 In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir. 1992). 

15 66 Fed. Reg. 1092 (2001). 

B. Proper Application of the Legal Standard 

Appellants rely on the gene amplification data to establish the patentable utility of the 
claimed PR0269 polypeptides. The gene amplification data for the gene encoding the PR0269 
polypeptide are clearly disclosed in the instant specification under 
Example 92. 

It was well known in the art at the time the invention was made that gene amplification is 
an essential mechanism for oncogene activation. The gene amplification assay is well-described 
in Example 92 of the present application. Example 92 discloses that the inventors isolated 
genomic DNA from a variety of primary cancers and cancer cell lines that are listed in Table 9, 
including primary lung tumors of the type and stage indicated in Table 8. As a negative control, 
DNA was isolated from the cells of ten normal healthy individuals and pooled. 
Gene amplification was monitored using real-time quantitative TaqMan™ PCR. 
Table 9 shows the resulting gene amplification data. Further, Example 92 explains that the 
results of TaqMan™ PCR are reported in ΔCt units, wherein one unit corresponds to one PCR 
cycle or approximately a 2-fold amplification relative to control, two units correspond to 4-fold 
amplification, three units to 8-fold amplification, etc. 

Appellants respectfully submit that a ΔCt value of at least 1.0 was observed for PR0269 
in at least 8 of the tumors and tumor cell lines listed in Table 9. PR0269 showed approximately 
1 to 2 ΔCt units, which corresponds to 2^1.04- to 2^1.80-fold amplification, or 2.056- to 3.482-fold 
amplification, in primary lung tumors. Accordingly, the present specification clearly discloses 
overwhelming evidence that the gene encoding the PR0269 polypeptide is significantly 
amplified in a number of lung tumors. 

16 M.P.E.P. §2107.01. 

17 M.P.E.P. §2107 II (B)(1). 
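
For readers who wish to check the arithmetic, the conversion from ΔCt units to fold amplification described in Example 92 can be sketched in a few lines of Python. This is an illustration only and is not part of the specification: the function name `fold_amplification` is hypothetical, and the sketch assumes ideal PCR efficiency, so that each cycle exactly doubles the product and the fold change is 2 raised to the ΔCt value.

```python
def fold_amplification(delta_ct: float) -> float:
    """Return fold amplification relative to control for a given delta-Ct.

    Assumption (not from the specification): ideal PCR efficiency, so each
    cycle doubles the product and one delta-Ct unit is a 2-fold difference.
    """
    return 2.0 ** delta_ct


if __name__ == "__main__":
    # Whole-unit examples from Example 92, then the range stated for PR0269.
    for delta_ct in (1.0, 2.0, 3.0, 1.04, 1.80):
        fold = fold_amplification(delta_ct)
        print(f"delta-Ct = {delta_ct:.2f} -> {fold:.3f}-fold")
```

Under that assumption, ΔCt values of 1.04 and 1.80 give approximately 2.056-fold and 3.482-fold amplification, matching the range stated above for the primary lung tumors.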

It is also well known that gene amplification occurs in most solid tumors, and generally is 
associated with poor prognosis. 

In support, Appellants have submitted, in their Response filed February 21, 2003, a 
Declaration by Dr. Audrey Goddard. Appellants particularly draw the Board's attention to page 3 
of the Goddard Declaration which clearly states that: 

It is further my considered scientific opinion that an at least 2-fold increase in 
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor) 
sample is significant and useful in that the detected increase in gene copy 
number in the tumor sample relative to the normal sample serves as a basis for 
using relative gene copy number as quantitated by the TaqMan PCR technique 
as a diagnostic marker for the presence or absence of tumor in a tissue sample of 
unknown pathology. Accordingly, a gene identified as being amplified at least 
2-fold by the quantitative TaqMan PCR assay in a tumor sample relative to a 
normal sample is useful as a marker for the diagnosis of cancer, for 
monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. (Emphasis added). 

As indicated above, the gene encoding the PR0269 polypeptide shows at least a two fold 
amplification in the lung tumors tested. In addition, the Goddard Declaration clearly establishes 
that the TaqMan real-time PCR method described in Example 92 has gained wide recognition for 
its versatility, sensitivity and accuracy, and is in extensive use for the study of gene amplification. 
The facts disclosed in the Declaration also confirm that based upon the gene amplification 
results, one of ordinary skill would find it credible that PR0269 is a diagnostic marker of lung 
cancer. 

Thus data relating to PR0269 polypeptide expression may be used for the same 
diagnostic and prognostic purposes as data relating to PR0269 gene expression. Example 92 in 
the specification further discloses, "Amplification is associated with overexpression of the gene 
product, indicating that the polypeptides are useful targets for therapeutic intervention in certain 
cancers such as colon, lung, breast and other cancers and diagnostic determination of the 
presence of those cancers." Accordingly, based on the Declaration by Dr. Goddard and the 
teachings in the specification, one of ordinary skill would find it credible that the PR0269 
polypeptide is a useful target as a cancer marker for diagnostic determination of the presence of 
lung tumors. 

Appellants' position is based on the overwhelming evidence from gene amplification data 
disclosed in the specification which clearly indicate that the gene encoding PR0269 is 
significantly amplified. Based on the working hypothesis among those skilled in the art that if a 
gene is amplified in cancer, the encoded protein is likely to be expressed at an elevated level, one 
skilled in the art would simply accept that since the PR0269 gene is amplified, the PR0269 
polypeptide would be more likely than not over-expressed. Accordingly, based on the disclosure 
in the specification, no additional experiments would be necessary to determine how to use the 
claimed polypeptide, because the current invention is fully enabled by the disclosure of the 
present application. 

Accordingly, Appellants submit that based on the general knowledge in the art at the time 
the invention was made and the teachings in the specification, the specification provides clear 
guidance as to how to interpret and use the data relating to the PR0269 polypeptide expression 
and that the PR0269 polypeptides have utility in the diagnosis of cancer. 

C. A prima facie case of lack of utility has not been established 

As a preliminary matter, Appellants respectfully submit that it is not a legal requirement 
to establish a necessary correlation between an increase in the copy number of the DNA and 
protein expression levels that would correlate to the disease state, nor is it imperative to find 
evidence that DNA amplification is "always" associated with overexpression of the gene product. 
As discussed above, the evidentiary standard to be used throughout ex parte examination of a 
patent application is a preponderance of the totality of the evidence under consideration. 
Accordingly, the question is not whether a necessary or even strong correlation between an 
increase in copy number and protein expression levels exists, but whether it is more likely than 
not that a person of ordinary skill in the pertinent art would recognize such a positive correlation. 
Appellants respectfully submit that when the proper evidentiary standard is applied, a correlation 
must be acknowledged. 

The Examiner cited Konopka et al. to establish that "protein expression is not related to 
gene amplification". Appellants submit that the PTO has generalized a very specific result 
disclosed by Konopka et al. to cover all genes. Konopka et al. actually state that "[p]rotein 
expression is not related to amplification of the abl gene but to variation in the level of bcr-abl 
mRNA produced from a single Ph1 template." (See Konopka et al., Abstract, emphasis added). 
The paper does not teach anything whatsoever about the correlation of protein expression and 
gene amplification in general, and provides no basis for the generalization that apparently 
underlies the present rejection. The statement of Konopka et al that "[p]rotein expression is not 
related to amplification of the abl gene . . ." is not sufficient to establish a prima facie case of 
lack of utility. It is not enough to show that for a particular gene a correlation does not exist. 
The law requires that the Examiner show evidence that it is more likely than not that such 
correlation, in general, does not exist. Such a showing has not been made. 

The Examiner also cited the abstract of Pennica et al as providing evidence showing lack 
of correlation between gene (DNA) amplification and elevated mRNA levels. The standard, 
however, is not absolute certainty. The fact that in the case of a specific class of closely related 
molecules there seemed to be no correlation with gene amplification and the level of 
mRNA/protein expression, does not establish that it is more likely than not, in general, that such 
correlation does not exist. The Examiner has not shown whether the lack of correlation observed 
for the family of WISP polypeptides is typical, or is merely a discrepancy, an exception to the 
rule of correlation. Indeed, the working hypothesis among those skilled in the art is that, if a 
gene is amplified in cancer, the encoded protein is likely to be expressed at an elevated level. In 
fact, as noted even in Pennica et al., "[a]n analysis of WISP-1 gene amplification and expression 
in human colon tumors showed a correlation between DNA amplification and over-expression 
. . . ." (Pennica et al., page 14722, left column, first full paragraph, emphasis added). Accordingly, 
Appellants respectfully submit that Pennica et al teaches nothing conclusive regarding the 
absence of correlation between amplification of a gene and over-expression of the encoded WISP 
polypeptide. More importantly, the teaching of Pennica et al is specific to WISP genes. Pennica 
et al has no teaching whatsoever about the correlation of gene amplification and protein 
expression in general. 

The Examiner has cited Haynes et al. as providing evidence that polypeptide levels 
cannot be accurately predicted from mRNA levels. Contrary to the Examiner's reading, Haynes 
et al teaches that "there was a general trend but no strong correlation between protein 
[expression] and transcript levels" (Emphasis added). For example, in Figure 1, there is a 
positive correlation between mRNA and protein levels amongst most of the 80 yeast proteins 
studied. In fact, very few data points deviated or scattered away from the expected normal and 
no data points showed a negative correlation between mRNA and protein levels (i.e. an increase 
in mRNA resulted in a decrease in protein levels). Appellants further note that Haynes et al was 
studying yeast cells and not human cells. Haynes et al notes that their analysis focused on the 80 
most abundant proteins in the yeast lysate (page 1867). Haynes et al states "since many 
important regulatory proteins are present only at low abundance, these would not be amenable to 
analysis" (page 1867). Further, Haynes et al compared the protein expression levels of these 
naturally abundant proteins to mRNA expression levels from published SAGE frequency tables 
(page 1863). Accordingly, Haynes et al. did not compare mRNA expression levels and protein 
levels in the same yeast cells. Thus the analysis by Haynes et al is not applicable to the present 
application. 

The Patent Office has failed to meet its initial burden of proof that Appellants' claims of 
utility are not substantial or credible. The arguments presented by the Examiner in combination 
with the Pennica et al, Konopka et al and Haynes et al articles do not provide sufficient reasons 
to doubt the statements by Appellants that PR0269 has utility. As discussed above, the law does 
not require the existence of a strong or linear correlation between mRNA and protein levels. Nor 
does the law require that DNA amplification is "always" associated with overexpression of the 
gene product. Therefore, Appellants submit that the Examiner's reasoning is based on a 
misrepresentation of the scientific data presented in the above cited references and application of 
an improper, heightened legal standard. In fact, contrary to what the Examiner contends, the art 
indicates that, if a gene is amplified in cancer, it is more likely than not that the encoded protein 
will be expressed at an elevated level. 







D. It is "more likely than not" for amplified genes to have increased mRNA and 
protein levels 

Appellants have submitted ample evidence to show that, in general, if a gene is amplified 
in cancer, it is more likely than not that the encoded protein will be expressed at an elevated 
level. First, the articles by Orntoft et al., Hyman et al., and Pollack et al. (made of record in 
Appellants' Response filed November 3, 2004) collectively teach that in general gene 
amplification increases mRNA expression. Second, the Declaration of Dr. Paul Polakis (made 
of record in Appellants' Response filed November 3, 2004), principal investigator of the Tumor 
Antigen Project of Genentech, Inc., the assignee of the present application, shows that, in 
general there is a correlation between mRNA levels and polypeptide levels. Thus, taken 
together, all of the submitted evidence supports Appellants' position that gene amplification is 
more likely than not predictive of increased mRNA and polypeptide levels. 

Appellants submit that generally, if a gene is amplified in cancer, it is more likely than 
not that the encoded protein will be expressed at an elevated level. For example, Orntoft et al 
studied transcript levels of 5600 genes in malignant bladder cancers many of which were linked 
to the gain or loss of chromosomal material using an array-based method. Orntoft et al showed 
that there was a gene dosage effect and taught that "in general (18 of 23 cases) chromosomal 
areas with more than 2-fold gain of DNA showed a corresponding increase in mRNA transcripts" 
(Column 1, abstract). 

In addition, Hyman et al showed, using CGH analysis on cDNA microarrays which 
compared DNA copy numbers and mRNA expression of over 12,000 genes in breast cancer 
tumors and cell lines, that there was "evidence of a prominent global influence of copy number 
changes on gene expression levels." (Page 6244, column 1, last paragraph). 

Additional supportive teachings were also provided by Pollack et al, who studied a series 
of primary human breast tumors and showed that "62% of highly amplified genes show 
moderately or highly elevated expression, and DNA copy number influences gene expression 
across a wide range of DNA copy number alterations (deletion, low-, mid- and high-level 
amplification), and that on average, a 2-fold change in DNA copy number is associated with a 
corresponding 1.5-fold change in mRNA levels." (emphasis added) Thus, these articles 
collectively teach that in general, gene amplification increases mRNA expression. 

The Examiner states that Orntoft et al do not appear to look at gene amplification, 
mRNA levels and polypeptide levels from a single gene at a time. "Orntoft et al concentrated on 
regions of chromosomes with strong gains of chromosomal material containing clusters of 
gene. . .It is not clear whether or not PR0269 is in a gene cluster in a region of a chromosome that 
is highly amplified. Therefore the relevance of Orntoft et al is allegedly not clear." (Page 4 of 
the Office Action mailed January 13, 2005) 

The Examiner states that "Hyman et al used the same CGH approach in their research. 
Less than half of highly amplified genes showed mRNA overexpression (abstract). Polypeptide 
levels were not investigated. Therefore Hyman et al also do not support utility of the claimed 
polypeptides" (Page 4 of the Office Action mailed January 13, 2005). The Examiner states that 
"Pollack et al also used CGH technology concentrating on large chromosome regions showing 
high amplification. Polypeptide levels were not investigated. Therefore Pollack et al also 
allegedly do not support the asserted utility of the claimed invention" (Page 4 of the Office 
Action mailed January 13, 2005). 

Appellants respectfully point out that in Orntoft et al, 1,800 genes that yielded an 
increase or decrease in mRNA expression in two invasive tumors compared to the two non- 
invasive papillomas were then mapped to chromosomal locations. The chromosomes had 
already been analyzed for amplification by hybridizing tumor DNA to normal metaphase 
chromosomes (CGH). Orntoft et al used CGH alterations as the independent variable and 
estimated the frequency of expression alterations of the 1,800 genes in the chromosomal areas. 
Orntoft et al found that in general (77% and 80% concordance) areas with a strong gain of 
chromosomal material contained a cluster of genes having increased mRNA expression (page 
40). Orntoft et al states "For both tumors TCC733 (p<0.015) and TCC827 (p<0.00003) a highly 
significant correlation was observed between the level of CGH ratio change (reflecting the DNA 
copy number) and alterations detected by the array based technology" (page 41, col. 1). Orntoft 
et al also studied the relation between altered mRNA and protein levels using 2D-PAGE 
analysis. Orntoft et al state, "In general there was a highly significant correlation (p<0.005) 
between mRNA and protein alterations. . . 26 well focused proteins whose genes had a known 
chromosomal location were detected in TCCs 733 and 335, and of these 19 correlated (p<0.005) 
with the mRNA changes detected using the arrays." Clearly Orntoft et al support Appellants' 
position that proteins expressed by genes that are amplified in tumors are useful as cancer 
markers. 

The Examiner has stated that Appellants have not indicated whether PR0269 is in a gene 
cluster region of a chromosome. (Page 4 of the Office Action mailed January 13, 2005). 
Appellants fail to see how this is relevant to the analysis. Orntoft et al did not limit their 
findings to only those regions of amplified gene clusters. Further, as discussed below, Hyman et 
al and Pollack et al. did gene-by-gene analysis across all chromosomes. 

Appellants respectfully submit that the Examiner has mischaracterized the methods used 
by Hyman et al and Pollack et al in their analysis. These papers did not use traditional CGH 
analysis to identify amplified genes. In Hyman et al, 13,824 cDNA clones were placed on glass 
slides in a microarray and genomic DNA from breast cancer cell lines and normal human WBCs 
were hybridized to the cDNA sequences. For expression analysis, RNA from tumor cell lines 
was hybridized on the same microarrays. The 13,824 arrayed cDNA clones were analyzed for 
gene expression and gene copy number in 14 breast cancer cell lines. Hyman et al states that the 
cDNA/CGH microarray technique enables the direct correlation of copy number and expression 
data on a gene-by-gene basis throughout the genome (page 6242, col. 2). Hyman et al state, 
"The results illustrate a considerable influence of copy number on gene expression patterns." For 
example, Hyman et al teach that "[u]p to 44% of the highly amplified transcripts (CGH ratio, 
>2.5) were overexpressed (i.e., belonged to the global upper 7% of expression ratios) compared 
with only 6% for genes with normal copy number." (See page 6242, column 1). Further, Hyman 
et al state that "[t]he cDNA/CGH microarray technique enables the direct correlation of copy 
number and expression data on a gene-by-gene basis throughout the genome." (See page 6242, 
column 2). Therefore, the analysis performed by Hyman et al was on a gene-by-gene basis, and 
clearly shows that "it is more likely than not" that a gene which is amplified in tumor cells will 
have increased gene expression. 

In Pollack et al, DNA copy number alteration across 6,691 mapped human genes in 44 
predominantly advanced primary breast tumors and 10 breast cancer cell lines was profiled. 
Pollack et al further state, "Parallel microarray measurements of mRNA levels reveal the 
remarkable degree to which variation in gene copy number contributes to variation in gene 
expression in tumor cells." (Abstract). "Genome-wide, of 117 high-level DNA amplifications 
(fluorescence ratios >4, and representing 91 different genes), 62% (representing 54 different 
genes; . . .) are found associated with at least moderately elevated mRNA levels (mean-centered 
fluorescence ratios >2), and 42% (representing 36 different genes) are found associated with 
comparably highly elevated mRNA levels (mean-centered fluorescence ratios >4)" (page 12966). 
Therefore, the analysis performed by Pollack et al was also on a gene-by-gene basis, and clearly 
shows that "it is more likely than not" that a gene which is amplified in tumor cells will have 
increased gene expression. 

With regard to the correlation between mRNA expression and protein levels, Appellants 
submitted a Declaration by Dr. Polakis, principal investigator of the Tumor Antigen Project of 
Genentech, Inc., the assignee of the present application, to show that mRNA expression 
correlates well with protein levels, in general. As Dr. Polakis explains, the primary focus of the 
microarray project was to identify tumor cell markers useful as targets for both the diagnosis and 
treatment of cancer in humans. The scientists working on the project extensively rely on results 
of microarray experiments in their effort to identify such markers. As Dr. Polakis explains, using 
microarray analysis, Genentech scientists have identified approximately 200 gene transcripts 
(mRNAs) that are present in human tumor cells at significantly higher levels than in 
corresponding normal human cells. To date, they have generated antibodies that bind to about 30 
of the tumor antigen proteins expressed from these differentially expressed gene transcripts and 
have used these antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. Having compared 
the levels of mRNA and protein in both the tumor and normal cells analyzed, they found a very 
good correlation between mRNA and corresponding protein levels. Specifically, in 
approximately 80% of their observations they have found that increases in the level of a 
particular mRNA correlate with changes in the level of protein expressed from that mRNA. 

While the proper legal standard is to show that the existence of correlation between 
mRNA and polypeptide levels is more likely than not, the showing of approximately 80% 
correlation for the molecules tested in the Polakis Declaration greatly exceeds this legal standard. 
Based on these experimental data and his vast scientific experience of more than 20 years, Dr. 
Polakis states that, for human genes, increased mRNA levels typically correlate with an increase 
in abundance of the encoded protein. He further confirms that "it remains a central dogma in 
molecular biology that increased mRNA levels are predictive of corresponding increased levels 
of the encoded protein." 

With regard to the correlation between mRNA expression and protein levels, the 
Examiner has asserted that the Polakis Declaration is insufficient to overcome the rejection of 
claims 39-43 since it is limited to a discussion of data regarding the correlation of mRNA levels 
and polypeptide levels and not gene amplification levels. The Examiner further asserted that the 
declaration does not provide data such that the Examiner can independently draw conclusions. 
(Page 6 of the Office Action mailed January 13, 2005). 

Appellants submit that Dr. Polakis' Declaration was presented to support the position that 
there is a correlation between mRNA levels and polypeptide levels, the correlation between gene 
amplification and mRNA levels having already been established by the data shown in the Orntoft 
et al, Hyman et al., and Pollack et al. articles. Appellants emphasize that the opinions expressed 
in the Polakis Declaration, including the quoted statement, are all based on factual findings. 
Thus, Dr. Polakis explains that in the course of their research using microarray analysis, he and 
his co-workers identified approximately 200 gene transcripts that are present in human tumor 
cells at significantly higher levels than in corresponding normal human cells. Subsequently, 
antibodies binding to about 30 of these tumor antigens were prepared, and mRNA and protein 
levels were compared. In approximately 80% of the cases, the researchers found that increases in 
the level of a particular mRNA correlated with changes in the level of protein expressed from 
that mRNA when human tumor cells are compared with their corresponding normal cells. Dr. 
Polakis' statement that "an increased level of mRNA in a tumor cell relative to a normal cell 
typically correlates to a similar increase in abundance of the encoded protein in the tumor cell 
relative to the normal cell" is based on factual, experimental findings, clearly set forth in the 
Declaration. Accordingly, the Declaration is not merely conclusory, and the fact-based 
conclusions of Dr. Polakis would be considered reasonable and accurate by one skilled in the art. 

The case law has clearly established that in considering affidavit evidence, the Examiner 
must consider all of the evidence of record anew. 18 "After evidence or argument is submitted by 
the applicant in response, patentability is determined on the totality of the record, by a 
preponderance of the evidence with due consideration to persuasiveness of argument" 19 
Furthermore, the Federal Court of Appeals held in In re Alton, "We are aware of no reason why 
opinion evidence relating to a fact issue should not be considered by an examiner." 20 Appellants 
also respectfully draw the Examiner's attention to the Utility Examination Guidelines 21 which 
state, "Office personnel must accept an opinion from a qualified expert that is based upon 
relevant facts whose accuracy is not being questioned; it is improper to disregard the opinion 
solely because of a disagreement over the significance or meaning of the facts offered." The 
statement in question from an expert in the field (the Polakis Declaration) states that "it is my 
considered scientific opinion that for human genes, an increased level of mRNA in a tumor cell 
relative to a normal cell typically correlates to a similar increase in abundance of the encoded 
protein in the tumor cell relative to the normal cell." Therefore, barring evidence to the contrary 
regarding the above statement in the Polakis Declaration, this rejection is improper under both 
the case law and the Utility guidelines. 

Taken together, although there are some examples in the scientific art that do not fit 
within the central dogma of molecular biology that there is a correlation between polypeptide and 
mRNA levels, these instances are exceptions rather than the rule. In the majority of amplified 
genes, the teachings in the art, as exemplified by Orntoft et al, Hyman et al, Pollack et al, and 
the Polakis Declaration, overwhelmingly show that gene amplification influences gene 
expression at the mRNA and protein levels. Therefore, one of skill in the art would reasonably 
expect in this instance, based on the amplification data for the PR0269 gene, that the PR0269 
polypeptide is concomitantly overexpressed. Thus, Appellants submit that the PR0269 
polypeptides and antibodies have utility in the diagnosis of cancer and based on such a utility, 
one of skill in the art would know exactly how to use the antibody for diagnosis of cancer. 

18 In re Rinehart, 531 F.2d 1084, 189 USPQ 143 (C.C.P.A. 1976) and In re Piasecki, 745 F.2d 1015, 226 
USPQ 881 (Fed. Cir. 1985). 

19 In re Alton, 37 USPQ2d 1578 (Fed. Cir. 1996) at 1584 quoting In re Oetiker, 977 F.2d 1443, 1445, 24 
USPQ2d 1443, 1444 (Fed. Cir. 1992). 

20 In re Alton, supra. 

21 Part IIB, 66 Fed. Reg. 1098 (2001). 

The Examiner cites Hu et al for support that genes displaying a 5-fold change or less in 
tumors compared to normal showed no evidence of a correlation between altered gene expression 
and a known role in the disease. However, among genes with a 10-fold or more change in 
expression level, there was a strong and significant correlation between expression level and a 
published role in the disease. 

Appellants disagree with the applicability of Hu et al in this case. Appellants note that 
Hu et al only studies the statistical analysis of micro-array data and not gene amplification data. 
Therefore, their findings would not be directly applicable to the gene amplification data. In 
addition, Appellants respectfully submit that the Hu et al reference does not show a lack of 
correlation between microarray data and the biological significance of cancer genes. 

First, the analysis by Hu et al has certain statistical flaws. According to Hu et al, 
"different statistical methods" were applied to "estimate the strength of gene-disease relationships 
and evaluated the results." (Page 406, left column, emphasis added). Using these different 
statistical methods, Hu et al "[a]ssessed the relative strengths of gene-disease relationships based 
on the frequency of both co-citation and single citation." (Page 411, left column). It is well 
known in the art that various statistical methods allow different variables to be manipulated to 
affect the outcome. For example, the authors admit, "Initial attempts to search the literature 
using" the list of genes, gene names, gene symbols, and frequently used synonyms, generated by 
the authors "revealed several sources of false positives and false negatives." (Page 406, right 
column). The authors further admit that the false positives caused by "duplicative and unrelated 
meanings for the term" were "difficult to manage." Therefore, in order to minimize such false 
positives, Hu et al disclose that these terms "had to be eliminated entirely, thereby reducing the 
false positive rate but unavoidably under-representing some genes." Id. Hence, Appellants 
respectfully submit that in order to minimize the false positives and negatives in their analysis, 
Hu et al manipulated various aspects of the input data. 

Secondly, Appellants submit that the statistical analysis by Hu et al is not a reliable 
standard because the frequency of citation only reflects the current research interest of a molecule 
but not the true biological function of the molecule. Indeed, the authors acknowledge that 
"[r]elationship established by frequency of co-citation do not necessarily represent a true 
biological link." (Page 411, right column). It often happens in scientific research that important 
molecules are overlooked by the scientific community for many years until the discovery of their true 
function. Therefore, Appellants submit that Hu et al drew their conclusions based on a very 
unreliable standard and their research does not provide any meaningful information regarding the 
correlation between the microarray data and the biological significance. 

Even assuming that Hu et al provide evidence to support a true relationship, the 
conclusion in Hu et al only applies to a specific type of breast tumor (estrogen receptor (ER)- 
positive breast tumor) and cannot be generalized as a principle governing microarray study of 
breast cancer in general, let alone the various other types of cancer genes in general. In fact, even 
Hu et al admit that "[i]t is likely that this threshold will change depending on the disease as well 
as the experiment. Interestingly, the observed correlation was only found among ER-positive 
(breast) tumors not ER-negative tumors." (Page 412, left column). Therefore, based on these 
findings, the authors add, "[t]his may reflect a bias in the literature to study the more prevalent 
type of tumor in the population. Furthermore, this emphasizes that caution must be taken when 
interpreting experiments that may contain subpopulations that behave very differently." Id. 
(Emphasis added). 

The Examiner cites Hanna and Mornin as showing that gene amplification does not 
reliably correlate with polypeptide over-expression and thus the level of polypeptide expression 
must be tested empirically (page 6 of the Advisory Action mailed March 30, 2005). 

Appellants disagree. Hanna and Mornin describe HER-2/neu breast cancer predictive 
testing methods which have been FDA approved: immunohistochemistry and fluorescent in situ 
hybridization. While Hanna and Mornin indicate that some subsets of tumors were found 
lacking protein overexpression with gene amplification, Hanna and Mornin state that "in general, 
FISH and IHC results correlate well." (Column 2) Accordingly, it is more likely than not that 
protein expression will correlate with gene amplification. 

Appellants further submitted the Declaration by Avi Ashkenazi, Ph.D., an expert in the 
field of cancer biology and a Director of the Molecular Oncology Department at Genentech, Inc., 
the assignee of the present application. In his Declaration, Dr. Ashkenazi states: 

"If gene amplification results in over-expression of the mRNA and corresponding gene 
product, then it identifies that gene product as a promising target for cancer therapy, for 
example by the therapeutic antibody approach." 

In summary, Appellants respectfully submit that the Examiner has not shown a lack of 
correlation between gene amplification data and the biological significance of cancer genes. On 
the other hand, Appellants have clearly demonstrated a credible, specific and substantial asserted 
utility for the PR0269 polypeptide. 

E. Even if a prima facie case of lack of utility has been established, it should be 
withdrawn on consideration of the totality of evidence 

Even if one assumes arguendo that it is more likely than not that there is no correlation 
between gene amplification and increased mRNA/protein expression, which Appellants submit is 
not true, a polypeptide encoded by a gene that is amplified in cancer would still have a specific, 
substantial, and credible utility. In support, Appellants respectfully draw the Board's attention to 
page 2 of the Declaration of Dr. Avi Ashkenazi (submitted with the Response filed May 21, 
2004) which explains that, 

even when amplification of a cancer marker gene does not result in significant 
over-expression of the corresponding gene product, this very absence of gene 
product over-expression still provides significant information for cancer diagnosis 
and treatment. Thus, if over-expression of the gene product does not parallel gene 
amplification in certain tumor types but does so in others, then parallel monitoring 
of gene amplification and gene product over-expression enables more accurate 
tumor classification and hence better determination of suitable therapy. In 
addition, absence of over-expression is crucial information for the practicing 
clinician. If a gene is amplified but the corresponding gene product is not over- 
expressed, the clinician accordingly will decide not to treat a patient with agents 
that target that gene product. 

Appellants thus submit that simultaneous testing of gene amplification and gene product 
over-expression enables more accurate tumor classification, even if the gene-product, the protein, 
is not over-expressed. This leads to better determination of a suitable therapy. Further, as 
explained in Dr. Ashkenazi's Declaration, absence of over-expression of the protein itself is 
crucial information for the practicing clinician. If a gene is amplified in a tumor, but the 
corresponding gene product is not over-expressed, the clinician will decide not to treat a patient 
with agents that target that gene product. This not only saves money, but also has the benefit that 
the patient can avoid exposure to the side effects associated with such agents. 

Appellants have clearly shown that the gene encoding the PR0269 polypeptide is 
amplified in at least 8 primary lung tumors and lung tumor cell lines. Therefore, the PR0269 
gene is a tumor associated gene. Furthermore, as discussed above, in the majority of amplified 
genes, the teachings in the art overwhelmingly show that gene amplification influences gene 
expression at the mRNA and protein levels. Therefore, one of skill in the art would reasonably 
expect in this instance, based on the amplification data for the PR0269 gene, that the PR0269 
polypeptide is concomitantly overexpressed. 

However, even if gene amplification does not result in overexpression of the gene product 
(i.e., the protein), an analysis of the expression of the protein is useful in determining the course 
of treatment, as supported by the Ashkenazi Declaration. The Examiner appears to view the 
testing described in the Ashkenazi Declaration as experiments involving further characterization 
of the PR0269 polypeptide itself. In fact, such testing is for the purpose of characterizing not the 
PR0269 polypeptide, but the tumors in which the gene encoding PR0269 is amplified. The 
PR0269 polypeptide and antibodies which specifically bind thereto are therefore useful in tumor 
categorization, the results of which become an important tool in the hands of a physician 
enabling the selection of a treatment modality that holds the most promise for the successful 
treatment of a patient. 

For the reasons given above, Appellants respectfully submit that the present specification 
clearly describes, details and provides a patentable utility for the claimed invention. 
Accordingly, Appellants respectfully request reconsideration and reversal of the rejections of 
Claims 44-46 and 49-52 under 35 U.S.C. §101. 

ISSUE II: Claims 44-46 and 49-52 satisfy the enablement requirement of 35 U.S.C. §112, 
first paragraph. 

Claims 44-46 and 49-52 stand rejected under 35 U.S.C. §112, first paragraph, allegedly 
"since the claimed invention is not supported by either a specific and substantial asserted utility or a 
well established utility for the reasons set forth above, one skilled in the art clearly would not know 
how to use the claimed invention." (Page 2 of the Office Action mailed January 13, 2005). 

In this regard, Appellants refer to the arguments and information presented above in 
response to the outstanding rejection under 35 U.S.C. § 101, wherein those arguments are 
incorporated by reference herein. Appellants respectfully submit that as described above, the 
PR0269 polypeptides have utility in the diagnosis of cancer and based on such a utility, one of 
skill in the art would know exactly how to use the claimed polypeptides for diagnosis of cancer, 
without undue experimentation. 

Accordingly, Appellants respectfully request reconsideration and reversal of the 
enablement rejection of Claims 44-46 and 49-52 under 35 U.S.C. §112, first paragraph. 

9. CONCLUSION 

For the reasons given above, Appellants submit that the specification discloses at least 
one patentable utility for PR0269 polypeptides of Claims 44-46 and 49-52, and that one of 
ordinary skill in the art would understand how to use the claimed antibodies in the diagnosis of 
lung tumors. Therefore, claims 44-46 and 49-52 meet the requirements of 35 U.S.C. §101. 

Accordingly, reversal of all the rejections of claims 44-46 and 49-52 is respectfully 
requested. 

Please charge any additional fees, including fees for additional extension of time, or 
credit overpayment to Deposit Account No. 08-1641 (referencing Attorney's Docket 
No. 39780-1618P2C33). 

Respectfully submitted, 

Date: September 13, 2005 

Leslie A. Mooi (Reg. No. 37,047) 

HELLER EHRMAN LLP 
275 Middlefield Road 
Menlo Park, California 94025-3506 
Telephone: (650) 324-7000 
Facsimile: (650) 324-0638 

APPENDIX A 
Claims on Appeal 

44. An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 96; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 96, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the extracellular domain of the polypeptide of SEQ ID 
NO: 96; or 

(d) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209397. 

45. The isolated polypeptide of Claim 44 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO: 96. 

46. The isolated polypeptide of Claim 44 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO: 96, lacking its associated signal peptide. 

49. The isolated polypeptide of Claim 44 comprising the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 209397. 

50. A chimeric polypeptide comprising a polypeptide according to Claim 44 fused to a 
heterologous polypeptide. 

51. The chimeric polypeptide of Claim 50, wherein said heterologous polypeptide is an 
epitope tag or an Fc region of an immunoglobulin. 

52. The isolated polypeptide of Claim 44 comprising the amino acid sequence of the 
extracellular domain of the polypeptide of SEQ ID NO: 96. 

9. EVIDENCE APPENDIX 

1. Declaration of Audrey Goddard, Ph.D. under 37 C.F.R. 1.132. 

2. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. 1.132. 

3. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. 1.132. 

4. Orntoft et al, Mol. and Cell. Proteomics 1:37-45 (2002). 

5. Hyman et al, Cancer Res. 62:6240-45 (2002). 

6. Pollack et al, Proc. Natl. Acad. Sci. USA 99:12963-12968 (2002). 

7. Pennica et al, Proc. Natl. Acad. Sci. USA 95:14717-14722 (1998). 

8. Konopka et al, Proc. Natl. Acad. Sci. USA 83:4049-4052 (1986). 

9. Haynes et al, Electrophoresis 19:1862-1871 (1998). 

10. Hu et al, Journal of Proteome Research 2:405-412 (2003). 

11. Hanna and Mornin, Pathology Associates Medical Laboratories (1999). 

Item 1 was submitted with Appellants' Response filed February 21, 2003. Item 2 was submitted with 
Appellants' Response filed May 21, 2004. Items 3-6 were submitted with Appellants' Response filed 
November 3, 2004, and noted as considered by the Examiner in the final Office Action mailed 
January 13, 2005. 

Items 7-9 were made of record by the Examiner in the Office Action mailed January 21, 2004. 
Item 10 was made of record by the Examiner in the Office Action mailed January 13, 2005. Item 
11 was submitted by Appellants in their Response filed May 21, 2004 and referenced by the 
Examiner in the Advisory Action mailed March 30, 2005. 

DECLARATION OF AUDREY D. GODDARD, Ph.D. UNDER 37 C.F.R. § 1.132 

Assistant Commissioner of Patents 
Washington, D.C. 20231 

Sir: 

I, Audrey D. Goddard, Ph.D. do hereby declare and say as follows: 

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical 
Affairs Department of Genentech, Inc., South San Francisco, California 94080. 

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular 
Biology Department of Genentech, Inc. During this time, my responsibilities included the 
identification and characterization of genes contributing to the oncogenic process, and determination 
of the chromosomal localization of novel genes. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to and 
forms part of this Declaration (Exhibit A). 

4. I am familiar with a variety of techniques known in the art for detecting and 
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e., 
"gene amplification") assay described in the above captioned patent application. 

5. The TaqMan PCR assay is described, for example, in the following scientific 
publications: Higuchi et al, Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al, PCR 
Methods Appl. 4:357-362 (1995) (Exhibit C) and Heid et al, Genome Res. 6:986-994 (1996) 
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent 
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled 
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of 
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the 
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive 
technique, which allows detection in the exponential phase of the PCR reaction and, as a result, 
leads to accurate determination of gene copy number. 

6. The quantitative fluorescent TaqMan PCR assay has been extensively and 
successfully used to characterize genes involved in cancer development and progression. 
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely 
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative 
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al, Proc. 
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al, Nature 
396(6712):699-703 (1998) (Exhibit F) and Bieche et al, Int. J. Cancer 78:661-666 (1998) (Exhibit 
G), of the first two of which I am a co-author. In particular, Pennica et al have used the quantitative 
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines, 
colorectal tumors and normal mucosa. Pitti et al studied the genomic amplification of a decoy 
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche 
et al used the assay to study gene amplification in breast cancer. 

7. It is my personal experience that the quantitative TaqMan PCR technique is 
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to 
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy 
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and 
useful in that the detected increase in gene copy number in the tumor sample relative to the normal 
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR 
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown 
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative 
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the 
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer 
therapy. 

8. I declare further that all statements made herein of my own knowledge are true and 
that all statements made on information and belief are believed to be true. I declare that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code, and that such willful false statements may jeopardize the validity of the application or any 
patent issuing thereon. 

AUDREY D. GODDARD, Ph.D. 



Genentech, Inc. 
1 DNA Way 

South San Francisco, CA, 94080 
650.225.6429 
goddarda@gene.com 

PROFESSIONAL EXPERIENCE 

Genentech, Inc. 1993-present 
South San Francisco, CA 

2001 - present Senior Clinical Scientist 

Experimental Medicine / BioOncology, Medical Affairs 

Responsibilities: 

• Companion diagnostic oncology products 

• Acquisition of clinical samples from Genentech's clinical trials for translational research 

• Translational research using clinical specimen and data for drug development and 
diagnostics 

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR 
Part 11 Subteam 

Interests: 

• Ethical and legal implications of experiments with clinical specimens and data 

• Application of pharmacogenomics in clinical trials 



1998 - 2001 Senior Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate 
scientist, senior research associate and research assistants/associate levels 

• Management of a $750K budget 

• DNA sequencing core facility supporting a 350+ person research facility. 

• DNA sequencing for high throughput gene discovery: ESTs, cDNAs, and constructs 

• Genomic sequence analysis and gene identification 

• DNA sequence and primary protein analysis 

Research: 

• Chromosomal localization of novel genes 

• Identification and characterization of genes contributing to the oncogenic process 

• Identification and characterization of genes contributing to inflammatory diseases 

• Design and development of schemes for high throughput genomic DNA sequence analysis 

• Candidate gene prediction and evaluation 



110 Congo St. 

San Francisco, CA, 94131 

415.841.9154 

415.819.2247 (mobile) 

agoddard@pacbell.net 

1993 - 1998 Scientist 

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities: 

• DNA sequencing core facility supporting a 350+ person research facility 

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification. 

• Design and implementation of analysis tools required for high throughput gene identification. 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 

Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89-12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia 
translocation breakpoints on chromosomes 17 and 15. 

• Prepared a successfully funded European Union multi-center grant application 

McMaster University 1983 
Hamilton, Ontario, Canada with Dr. G. D. Sweeney 

5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice 



EDUCATION 



Ph.D. 1989 
University of Toronto, Toronto, Ontario, Canada. Department of Medical Biophysics. 
"Phenotypic and genotypic effects of mutations in the human retinoblastoma gene." 
Supervisor: Dr. R. A. Phillips 

Honours B.Sc. 1983 
McMaster University, Hamilton, Ontario, Canada. Department of Biochemistry. 
"The in vitro metabolism of the cytochrome P-448 inducer β-naphthoflavone in C57BL/6J mice." 
Supervisor: Dr. G. D. Sweeney 

ACADEMIC AWARDS 

Imperial Cancer Research Fund Postdoctoral Fellowship 1989-1992 
Medical Research Council Studentship 1983-1988 
NSERC Undergraduate Summer Research Award 1983 
Society of Chemical Industry Merit Award (Hons. Biochem.) 1983 
Dr. Harry Lyman Hooker Scholarship 1981-1983 
J.L.W. Gill Scholarship 1981-1982 
Business and Professional Women's Club Scholarship 1980-1981 
Wyerhauser Foundation Scholarship 1979-1980 



INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and 
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield 
Park, AZ, USA. October 2000 

High throughput identification, cloning and characterization of novel genes. G2K: Back to 
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February 2000 

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing 
Users Meeting, Berkeley, CA, USA. April 1999 

High throughput secreted protein identification and cloning. Tenth International Genome 
Sequencing and Analysis Conference, Miami, FL, USA. September 1998 

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users 
Meeting, Berkeley, CA, USA. May 1998 

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short 
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San 
Francisco, CA, USA. October 1996 

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient 
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The 
Endocrine Society, Anaheim, CA, USA. June 1994 

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in 
acute promyelocytic leukemia. XV International Association for Comparative Research on 
Leukemia and Related Disease, Padua, Italy. October 1991 

PATENTS 

Goddard A, Godowski PJ, Gurney AL. NL2 Tie ligand homologue polypeptide. Patent 
Number: 6,455,496. Date of Patent: Sept. 24, 2002. 

Goddard A, Godowski PJ and Gurney AL NL3 Tie ligand homologue nucleic acids. Patent 
Number: 6,426,218. Date of Patent: July 30, 2002. 

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D, 
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of 
Patent: July 2, 2002. 

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid 
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent: 
Jun. 25, 2002. 

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ, 
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same. 
Patent Number: 6,387,657. Date of Patent: May 14, 2002. 

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of 
Patent: April 16, 2002. 

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent 
Number: 6,350,450. Date of Patent: Feb. 26, 2002. 

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie 
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent: 
Feb. 19, 2002. 

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350. 
Date of Patent: Feb. 19, 2002. 

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth 
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27, 
2001. 

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic 
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000 

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone 
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998 

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone 
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997 



Multiple additional provisional applications filed 




PUBLICATIONS 



Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD Comparative sequence analysis of 
the HER2 locus in mouse and man. Manuscript in preparation. 

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Hillan K, 
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the 
fibroblast growth factors. Manuscript submitted. 

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman 
S, Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown 
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link 
to obesity. Biochemical Journal 360: 135-142. 

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG, 
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the IL- 
17 receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664. 

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and 
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the 
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275: 
31335-31339. 

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of 
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97: 
8950-8954. 

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H, 
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal 
development. Nature 408: 366-369. 

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit 
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999. 

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard 
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of 
fused; a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112: 4437- 
4448. 

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q, 
Hillan K, Goddard A and Gurney, AL. (1999) FGF-19, a novel fibroblast growth factor with 
unique specificity for FGFR4. Cytokine 11: 729-735. 

Yan M, Lee J, Schilbach S, Goddard A and Dixit V. (1999) mE10, a novel caspase 
recruitment domain-containing proapoptotic molecule. J. Biol. Chem. 274(15): 10287-10292. 

Gurney AL, Marsters SA, Huang RM, Pitti RM, Mark DT, Baldwin DT, Gray AM, Dowd P, 
Brush J, Heldens S, Schow P, Goddard AD, Wood WI, Baker KP, Godowski PJ and 
Ashkenazi A. (1999) Identification of a new member of the tumor necrosis factor family and its 
receptor, a human ortholog of mouse GITR. Current Biology 9(4): 215-218. 

Ridgway JBB, Ng E, Kern JA, Lee J, Brush J, Goddard A and Carter P. (1999) Identification 
of a human anti-CD55 single-chain Fv by subtractive panning of a phage library using tumor 
and nontumor cell lines. Cancer Research 59: 2718-2723. 

Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ, 
Sherwood SW, Baldwin DT, Godowski PJ, Wood WI, Gurney AL, Hillan KJ, Cohen RL, 
Goddard AD, Botstein D and Ashkenazi A. (1998) Genomic amplification of a decoy receptor 
for Fas ligand in lung and colon cancer. Nature 396(6712): 699-703. 

Pennica D, Swanson TA, Welsh JW, Roy MA, Lawrence DA, Lee J, Brush J, Taneyhill LA, 
Deuel B, Lew M, Watanabe C, Cohen RL, Melhem MF, Finley GG, Quirke P, Goddard AD, 
Hillan KJ, Gurney AL, Botstein D and Levine AJ. (1998) WISP genes are members of the 
connective tissue growth factor family that are up-regulated in wnt-1-transformed cells and 
aberrantly expressed in human colon tumors. Proc. Natl. Acad. Sci. USA. 95(25): 14717- 
14722. 

Yang RB, Mark MR, Gray A, Huang A, Xie MH, Zhang M, Goddard A, Wood WI, Gurney AL 
and Godowski PJ. (1998) Toll-like receptor-2 mediates lipopolysaccharide-induced cellular 
signalling. Nature 395(6699): 284-288. 

Merchant AM, Zhu Z, Yuan JQ, Goddard A, Adams CW, Presta LG and Carter P. (1998) An 
efficient route to human bispecific IgG. Nature Biotechnology 16(7): 677-681. 

Marsters SA, Sheridan JP, Pitti RM, Brush J, Goddard A and Ashkenazi A. (1998) 
Identification of a ligand for the death-domain-containing receptor Apo3. Current Biology 8(9): 
525-528. 

Xie J, Murone M, Luoh SM, Ryan A, Gu Q, Zhang C, Bonifas JM, Lam CW, Hynes M, 
Goddard A, Rosenthal A, Epstein EH Jr. and de Sauvage FJ. (1998) Activating Smoothened 
mutations in sporadic basal-cell carcinoma. Nature. 391(6662): 90-92. 

Marsters SA, Sheridan JP, Pitti RM, Huang A, Skubatch M, Baldwin D, Yuan J, Gurney A, 
Goddard AD, Godowski P and Ashkenazi A. (1997) A novel receptor for Apo2L/TRAIL 
contains a truncated death domain. Current Biology. 7(12): 1003-1006. 

Hynes M, Stone DM, Dowd M, Pitts-Meek S, Goddard A, Gurney A and Rosenthal A. (1997) 
Control of cell pattern in the neural tube by the zinc finger transcription factor Gli-1. Neuron 
19: 15-26. 

Sheridan JP, Marsters SA, Pitti RM, Gurney A, Skubatch M, Baldwin D, Ramakrishnan L, 
Gray CL, Baker K, Wood WI, Goddard AD, Godowski P and Ashkenazi A. (1997) Control of 
TRAIL-Induced Apoptosis by a Family of Signaling and Decoy Receptors. Science 277 
(5327): 818-821. 

Goddard AD, Dowd P, Chernausek S, Geffner M, Gertner J, Hintz R, Hopwood N, Kaplan S, 
Plotnick L, Rogol A, Rosenfield R, Saenger P, Mauras N, Hershkopf R, Angulo M and Attie K. 
(1997) Partial growth hormone insensitivity: The role of growth hormone receptor mutations in 
idiopathic short stature. J. Pediatr. 131: S51-55. 

Klein RD, Sherman D, Ho WH, Stone D, Bennett GL, Moffat B, Vandlen R, Simmons L, Gu Q, 
Hongo JA, Devaux B, Poulsen K, Armanini M, Nozaki C, Asai N, Goddard A, Phillips H, 
Henderson CE, Takahashi M and Rosenthal A. (1997) A GPI-linked protein that interacts with 
Ret to form a candidate neurturin receptor. Nature. 387(6634): 717-21. 

Stone DM, Hynes M, Armanini M, Swanson TA, Gu Q, Johnson RL, Scott MP, Pennica D, 
Goddard A, Phillips H, Noll M, Hooper JE, de Sauvage F and Rosenthal A (1996) The 
tumour-suppressor gene patched encodes a candidate receptor for Sonic hedgehog. Nature 
384(6605): 129-34. 

Marsters SA, Sheridan JP, Donahue CJ, Pitti RM, Gray CL, Goddard AD, Bauer KD and 
Ashkenazi A. (1996) Apo-3, a new member of the tumor necrosis factor receptor family, 
contains a death domain and activates apoptosis and NF-kappa B. Current Biology 6(12): 
1669-76. 

Rothe M, Xiong J, Shu HB, Williamson K, Goddard A and Goeddel DV. (1996) I-TRAF is a 
novel TRAF-interacting protein that regulates TRAF-mediated signal transduction. Proc. Natl. 
Acad. Sci. USA 93: 8241-8246. 

Yang M, Luoh SM, Goddard A, Reilly D, Henzel W and Bass S. (1996) The bglX gene 
located at 47.8 min on the Escherichia coli chromosome encodes a periplasmic beta- 
glucosidase. Microbiology 142: 1659-65. 

Goddard AD and Black DM. (1996) Familial Cancer in Molecular Endocrinology of Cancer. 
Waxman, J. Ed. Cambridge University Press, Cambridge UK, pp. 187-215. 

Treanor JJS, Goodman L, de Sauvage F, Stone DM, Poulson KT, Beck CD, Gray C, Armanini 
MP, Pollocks RA, Hefti F, Phillips HS, Goddard A, Moore MW, Buj-Bello A, Davis AM, Asai N, 
Takahashi M, Vandlen R, Henderson CE and Rosenthal A. (1996) Characterization of a 
receptor for GDNF. Nature 382: 80-83. 

Klein RD, Gu Q, Goddard A and Rosenthal A. (1996) Selection for genes encoding secreted 
proteins and receptors. Proc. Natl. Acad. Sci. USA 93: 7108-7113. 

Winslow JW, Moran P, Valverde J, Shih A, Yuan JQ, Wong SC, Tsai SP, Goddard A, Henzel 
WJ, Hefti F and Caras I. (1995) Cloning of AL-1, a ligand for an Eph-related tyrosine kinase 
receptor involved in axon bundle formation. Neuron 14: 973-981. 

Bennett BD, Zeigler FC, Gu Q, Fendly B, Goddard AD, Gillett N and Matthews W. (1995) 
Molecular cloning of a ligand for the EPH-related receptor protein-tyrosine kinase Htk. Proc. 
Natl. Acad. Sci. USA 92: 1866-1870. 

Huang X, Yuang J, Goddard A, Foulis A, James RF, Lernmark A, Pujol-Borrell R, 
Rabinovitch A, Somoza N and Stewart TA. (1995) Interferon expression in the pancreases of 
patients with type I diabetes. Diabetes 44: 658-664. 

Goddard AD, Yuan JQ, Fairbairn L, Dexter M, Borrow J, Kozak C and Solomon E. (1995) 
Cloning of the murine homolog of the leukemia-associated PML gene. Mammalian Genome 
6: 732-737. 

Goddard AD, Covello R, Luoh SM, Clackson T, Attie KM, Gesundheit N, Rundle AC, Wells 
JA, Carlsson LMS and The Growth Hormone Insensitivity Study Group. (1995) Mutations of 
the growth hormone receptor in children with idiopathic short stature. N. Engl. J. Med. 333: 
1093-1098. 

Kuo SS, Moran P, Gripp J, Armanini M, Phillips HS, Goddard A and Caras IW. (1994) 
Identification and characterization of Batk, a predominantly brain-specific non-receptor protein 
tyrosine kinase related to Csk. J. Neurosci. Res. 38: 705-715. 

Mark MR, Scadden DT, Wang Z, Gu Q, Goddard A and Godowski PJ. (1994) Rse, a novel 
receptor-type tyrosine kinase with homology to Axl/Ufo, is expressed at high levels in the 
brain. Journal of Biological Chemistry 269: 10720-10728. 

Borrow J, Shipley J, Howe K, Kiely F, Goddard A, Sheer D, Srivastava A, Antony AC, 
Fioretos T, Mitelman F and Solomon E. (1994) Molecular analysis of simple variant 
translocations in acute promyelocytic leukemia. Genes Chromosomes Cancer 9: 234-243. 

Goddard AD and Solomon E. (1993) Genetics of Cancer. Adv. Hum. Genet. 21: 321-376. 

Borrow J, Goddard AD, Gibbons B, Katz F, Swirsky D, Fioretos T, Dube I, Winfield DA, 
Kingston J, Hagemeijer A, Rees JKH, Lister AT and Solomon E. (1992) Diagnosis of acute 
promyelocytic leukemia by RT-PCR: Detection of PML-RARA and RARA-PML fusion 
transcripts. Br. J. Haematol. 82: 529-540. 

Goddard AD, Borrow J and Solomon E. (1992) A previously uncharacterized gene, PML, is 
fused to the retinoic acid receptor alpha gene in acute promyelocytic leukemia. Leukemia 6 
Suppl 3: 117S-119S. 

Zhu X, Dunn JM, Goddard AD, Squire JA, Becker A, Phillips RA and Gallie BL. (1992) 
Mechanisms of loss of heterozygosity in retinoblastoma. Cytogenet. Cell. Genet. 59: 248-252. 

Foulkes W, Goddard A. and Patel K. (1991) Retinoblastoma linked with Seascale [letter]. 
British Med. J. 302: 409. 

Goddard AD, Borrow J, Freemont PS and Solomon E. (1991) Characterization of a novel zinc 
finger gene disrupted by the t(15;17) in acute promyelocytic leukemia. Science 254: 1371- 
1374. 

Solomon E, Borrow J and Goddard AD. (1991) Chromosomal aberrations in cancer. Science 
254: 1153-1160. 

Pajunen L, Jones TA, Goddard A, Sheer D, Solomon E, Pihlajaniemi T and Kivirikko KI. 
(1991) Regional assignment of the human gene coding for a multifunctional peptide (P4HB) 
acting as the β-subunit of prolyl-4-hydroxylase and the enzyme protein disulfide isomerase to 
17q25. Cytogenet. Cell. Genet. 56: 165-168. 

Borrow J, Black DM, Goddard AD, Yagle MK, Frischauf A.-M and Solomon E. (1991) 
Construction and regional localization of a NotI linking library from human chromosome 17q. 
Genomics 10: 477-480. 

Borrow J, Goddard AD, Sheer D and Solomon E. (1990) Molecular analysis of acute 
promyelocytic leukemia breakpoint cluster region on chromosome 17. Science 249: 1577- 
1580. 

Myers JC, Jones TA, Pohjolainen E-R, Kadri AS, Goddard AD, Sheer D, Solomon E and 
Pihlajaniemi T. (1990) Molecular cloning of α5(IV) collagen and assignment of the gene to the 
region of the X-chromosome containing the Alport Syndrome locus. Am. J. Hum. 
Genet. 46: 1024-1033. 

Gallie BL, Squire JA, Goddard A, Dunn JM, Canton M, Hinton D, Zhu X and Phillips RA. 
(1990) Mechanisms of oncogenesis in retinoblastoma. Lab. Invest. 62: 394-408. 

Goddard AD, Phillips RA, Greger V, Passarge E, Hopping W, Gallie BL and Horsthemke B. 
(1990) Use of the RB1 cDNA as a diagnostic probe in retinoblastoma families. Clinical 
Genetics 37: 117-126. 

Zhu XP, Dunn JM, Phillips RA, Goddard AD, Paton KE, Becker A and Gallie BL (1989) 
Germline, but not somatic, mutations of the RB1 gene preferentially involve the paternal 
allele. Nature 340: 312-314. 

Gallie BL, Dunn JM, Goddard A, Becker A and Phillips RA. (1988) Identification of mutations 
in the putative retinoblastoma gene. In Molecular Biology of The Eye: Genes, Vision and 
Ocular Disease. UCLA Symposia on Molecular and Cellular Biology, New Series, Volume 88. 
J. Piatigorsky, T. Shinohara and P.S. Zelenka, Eds. Alan R. Liss, Inc., New York, 1988, pp. 
427-436. 

Goddard AD, Balakier H, Canton M, Dunn J, Squire J, Reyes E, Becker A, Phillips RA and 
Gallie BL. (1988) Infrequent genomic rearrangement and normal expression of the putative 
RB1 gene in retinoblastoma tumors. Mol. Cell. Biol. 8: 2082-2088. 

Squire J, Dunn J, Goddard A, Hoffman T, Musarella M, Willard HF, Becker AJ, Gallie BL and 
Phillips RA. (1986) Cloning of the esterase D gene: A polymorphic gene probe closely linked 
to the retinoblastoma locus on chromosome 13. Proc. Natl. Acad. Sci. USA 83: 6573-6577. 

Squire J, Goddard AD, Canton M, Becker A, Phillips RA and Gallie BL (1986) Tumour 
induction by the retinoblastoma mutation is independent of N-myc expression. Nature 322: 
555-557. 

Goddard AD, Heddle JA, Gallie BL and Phillips RA. (1985) Radiation sensitivity of fibroblasts 
of bilateral retinoblastoma patients as determined by micronucleus induction in vitro. Mutation 
Research 152: 31-38. 

SIMULTANEOUS AMPLIFICATION AND DETECTION OF 
SPECIFIC DNA SEQUENCES 

Russell Higuchi*, Gavin Dollinger¹, P. Sean Walsh and Robert Griffith 

Roche Molecular Systems, Inc., 1400 53rd St., Emeryville, CA 94608. ¹Chiron Corporation, 1400 53rd St., Emeryville, CA 
94608. *Corresponding author. 



We have enhanced the polymerase chain reaction (PCR) such that specific DNA 
sequences can be detected without opening the reaction tube. This enhancement 
requires the addition of ethidium bromide (EtBr) to a PCR. Since the fluorescence of 
EtBr increases in the presence of double-stranded (ds) DNA, an increase in 
fluorescence in such a PCR indicates a positive amplification, which can be easily 
monitored externally. In fact, amplification can be continuously monitored in order to 
follow its progress. The ability to simultaneously amplify specific DNA sequences 
and detect the product of the amplification both simplifies and improves PCR and 
may facilitate its automation and more widespread use in the clinic or in other 
situations requiring high sample throughput. 

Although the potential benefits of PCR¹ to clinical diagnostics are well known²,³, it is still not 
widely used in this setting, even though it is four years since thermostable DNA polymerases⁴ 
made PCR practical. Some of the reasons for its slow acceptance are high cost, lack of 
automation of pre- and post-PCR processing steps, and false positive results from 
carryover-contamination. The first two points are related in that labor is the largest 
contributor to cost at the present stage of PCR development. Most current assays require 
some form of "downstream" processing once thermocycling is done in order to determine 
whether the target DNA sequence was present and has amplified. These include DNA 
hybridization⁵,⁶, gel electrophoresis with or without use of restriction digestion⁷,⁸, HPLC⁹, 
or capillary electrophoresis¹⁰. These methods are labor-intense, have low throughput, and 
are difficult to automate. The third point is also closely related to downstream processing. 
The handling of the PCR product in these downstream processes increases the chances that 
amplified DNA will spread through the typing lab, resulting in a risk of "carryover" false 
positives in subsequent testing¹¹. 

These downstream processing steps would be eliminated if specific amplification and 
detection of amplified DNA took place simultaneously within an unopened reaction vessel. 
Assays in which such different processes take place without the need to separate reaction 
components have been termed "homogeneous". No truly homogeneous PCR assay has been 
demonstrated to date, although progress towards this end has been reported. Chehab, et 
al.¹², developed a PCR product detection scheme using fluorescent primers that resulted in 
a fluorescent PCR product. Allele-specific primers, each with different fluorescent tags, 
were used to indicate the genotype of the DNA. However, the unincorporated primers must 
still be removed in a downstream process in order to visualize the result. Recently, Holland, 
et al.¹³, developed an assay in which the endogenous 5' exonuclease activity of Taq DNA 
polymerase was exploited to cleave a labeled oligonucleotide probe. The probe would only 
cleave if PCR amplification had produced its complementary sequence. In order to detect 
the cleavage products, however, a subsequent process is again needed. 

We have developed a truly homogeneous assay for PCR and PCR product detection based 
upon the greatly increased fluorescence that ethidium bromide and other DNA binding dyes 
exhibit when they are bound to ds-DNA¹⁴⁻¹⁶. As outlined in Figure 1, a prototype PCR 

FIGURE 1 Principle of simultaneous amplification and detection of PCR product. The 
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, EtBr 
bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when EtBr is 
bound to DNA and binding is greatly enhanced when DNA is double-stranded. After 
sufficient (n) cycles of PCR, the net increase in dsDNA results in additional EtBr binding 
and a net increase in total fluorescence. 

FIGURE 2 Gel electrophoresis of PCR amplification products of the human, nuclear gene, 
HLA DQα, made in the presence of increasing amounts of EtBr (up to 8 μg/ml). The 
presence of EtBr has no obvious effect on the yield or specificity of amplification. 

FIGURE 3 (A) Fluorescence measurements from PCRs that contain 0.5 μg/ml EtBr and that 
are specific for Y-chromosome repeat sequences. Five replicate PCRs were begun 
containing each of the DNAs specified. At each indicated cycle, one of the five replicate 
PCRs for each DNA was removed from thermocycling and its fluorescence measured. Units 
of fluorescence are arbitrary. (B) UV photography of PCR tubes (0.5 ml Eppendorf-style, 
polypropylene microcentrifuge tubes) containing reactions, those starting from 2 ng male 
DNA and control reactions without any DNA, from (A). 



begins with primers that are single-stranded DNA (ss-DNA), dNTPs, and DNA polymerase. 
An amount of dsDNA containing the target sequence (target DNA) is also typically present. 
This amount can vary, depending on the application, from single-cell amounts of DNA¹⁷ to 
micrograms per PCR¹⁸. If EtBr is present, the reagents that will fluoresce, in order of 
increasing fluorescence, are free EtBr itself, and EtBr bound to the single-stranded DNA 
primers and to the double-stranded target DNA (by its intercalation between the stacked 
bases of the DNA double-helix). After the first denaturation cycle, target DNA will be 
largely single-stranded. After a PCR is completed, the most significant change is the 
increase in the amount of dsDNA (the PCR product itself) of up to several micrograms. 
Formerly free EtBr is bound to the additional dsDNA, resulting in an increase in 
fluorescence. There is also some decrease in the amount of ssDNA primer, but because the 
binding of EtBr to ssDNA is much less than to dsDNA, the effect of this change on the total 
fluorescence of the sample is small. The fluorescence increase can be measured by 
directing excitation illumination through the walls of the amplification vessel before and 
after, or even continuously during, thermocycling. 

RESULTS 

PCR in the presence of EtBr. In order to assess the
effect of EtBr in PCR, amplifications of the human HLA
DQα gene 19 were performed with the dye present at
concentrations from 0.06 to 8.0 µg/ml (a typical concentration
of EtBr used in staining of nucleic acids following
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel
electrophoresis revealed little or no difference in the yield
or quality of the amplification product whether EtBr was
absent or present at any of these concentrations, indicating
that EtBr does not inhibit PCR.

Detection of human Y-chromosome specific
sequences. Sequence-specific fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 µg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome 20. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After 9,
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its
fluorescence measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs containing
human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in
fluorescence. Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the
expected size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 µg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for
either the wild-type or sickle-cell mutation of the human
β-globin gene 21. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs
of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in
fluorescence only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.
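
The typing logic described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `call_genotype`, the `background` argument, and the two-fold cut-off are assumptions introduced here (the text reports roughly a three-fold rise for a matched primer set, and gives no numerical threshold).

```python
def call_genotype(wild_type_tube, sickle_tube, background, fold=2.0):
    """Call a beta-globin genotype from one pair of completed reactions.

    wild_type_tube, sickle_tube: fluorescence (arbitrary units) of the tube
        holding the wild-type-specific or sickle-specific primers.
    background: fluorescence expected when the primer matches neither allele.
    fold: hypothetical cut-off, not taken from the article.
    """
    wild_type_amplified = wild_type_tube >= fold * background
    sickle_amplified = sickle_tube >= fold * background
    if wild_type_amplified and sickle_amplified:
        return "AS"  # both primer sets found a matching allele
    if wild_type_amplified:
        return "AA"
    if sickle_amplified:
        return "SS"
    return "no call"  # neither tube rose above background
```

A "no call" result for both tubes would be a natural trigger for the false-negative controls discussed later in the article.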

Continuous monitoring of a PCR. Using a fiber optic
device, it is possible to direct excitation illumination from
a spectrofluorometer to a PCR undergoing thermocycling
and to return its fluorescence to the spectrofluorometer.
The fluorescence readout of such an arrangement,
directed at an EtBr-containing amplification of Y-chromosome
specific sequences from 25 ng of human male DNA,
is shown in Figure 5. The readout from a control PCR
with no target DNA is also shown. Thirty cycles of PCR
were monitored for each.

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence
intensity rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation
temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change
significantly over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this
temperature there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the
fluorescence increase at the annealing temperature. Analysis of
the products of these two amplifications by gel
electrophoresis showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
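
Following the amplification by its per-cycle fluorescence maxima can be sketched as below. This is a minimal illustration under stated assumptions: the trace is taken to be evenly sampled with a fixed number of readings per thermocycle, and the names `per_cycle_maxima`, `first_rising_cycle`, `synthetic_trace`, the five-cycle baseline, and the 1.2-fold cut-off are all hypothetical choices made here, not details from the article.

```python
def per_cycle_maxima(trace, samples_per_cycle):
    """Split a fluorescence trace into thermocycles and return each cycle's maximum.

    The maximum of each cycle corresponds to the annealing/extension phase,
    where dsDNA is present for EtBr to bind.
    """
    return [
        max(trace[start:start + samples_per_cycle])
        for start in range(0, len(trace) - samples_per_cycle + 1, samples_per_cycle)
    ]


def first_rising_cycle(maxima, baseline_cycles=5, factor=1.2):
    """Return the 1-based cycle whose maximum first exceeds factor x baseline, or None."""
    baseline = sum(maxima[:baseline_cycles]) / baseline_cycles
    for cycle, value in enumerate(maxima, start=1):
        if cycle > baseline_cycles and value > factor * baseline:
            return cycle
    return None


def synthetic_trace(cycles=30, rise_after=20):
    """Build an invented trace: four readings per cycle, minimum at denaturation."""
    trace = []
    for cycle in range(1, cycles + 1):
        annealing = 10.0 + max(0, cycle - rise_after)  # maxima grow once product appears
        trace.extend([2.0, 6.0, annealing, 6.0])
    return trace


positive = per_cycle_maxima(synthetic_trace(), samples_per_cycle=4)
control = per_cycle_maxima(synthetic_trace(rise_after=30), samples_per_cycle=4)
print(first_rising_cycle(positive), first_rising_cycle(control))
```

In the synthetic positive run the maxima stay flat for twenty cycles and then rise, while the control never leaves its baseline, which mirrors the qualitative behavior described for the male DNA and no-DNA reactions.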

DISCUSSION 

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S) alleles of
the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
primers to the sickle allele. The photograph was taken after 30
cycles of PCR, and the input DNAs and the alleles they contain
are indicated. Fifty ng of DNA was used to begin PCR. Typing
was done in triplicate (3 pairs of PCRs) for each input DNA.
FIGURE 5 Continuous real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a PCR in progress and also
emitted light back to a fluorometer (see Experimental Protocol).
Amplification using human male-DNA specific primers in a PCR
starting with 20 ng of human male DNA (top), or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR, the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.
DNA (up to microgram amounts) in order to have
sufficient numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The
primer-dimer product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a
number of approaches, including the use of nested-primer
amplifications that take place in a single tube 8, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins 25. Preliminary results using these
approaches suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr
fluorescence in a PCR instigated by a single HIV genome in a
background of 10* cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer 2 ' 4 .

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of
specific DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader 46 .

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected.
Preliminary experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.
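
The link between starting amount and the cycle at which fluorescence becomes detectable can be illustrated with an idealized model. The sketch below assumes pure exponential growth at a constant per-cycle efficiency, which is a simplification: the article itself notes that by the time fluorescence is detectable the increase in DNA is becoming linear. The function name `cycles_to_threshold` and the `efficiency` parameter are assumptions made for this example.

```python
import math


def cycles_to_threshold(start_amount, threshold, efficiency=1.0):
    """Cycles n needed so that start_amount * (1 + efficiency)**n reaches threshold.

    efficiency=1.0 means perfect doubling each cycle (an idealization).
    Returns a fractional cycle count; 0.0 if the start already meets the threshold.
    """
    if start_amount >= threshold:
        return 0.0
    return math.log(threshold / start_amount) / math.log(1.0 + efficiency)


# Under perfect doubling, a two-fold difference in input shifts detection by one cycle,
# and the 30-fold difference between 60 ng and 2 ng shifts it by log2(30), about 4.9 cycles.
shift = cycles_to_threshold(2.0, 1000.0) - cycles_to_threshold(60.0, 1000.0)
print(round(shift, 2))
```

Read in reverse, the same relation is what would let an observed detection cycle be converted into an estimate of initial copy number, provided the efficiency is known and constant.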

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false
positive and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA polymerase,
may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of
cycles, many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.

In summary, the inclusion in PCR of dyes whose fluo- 
rescence is enhanced upon binding dsDNA makes it 
possible to detect specific DNA amplification from outside 
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR 
in applications that demand the high throughput of 
samples. 

EXPERIMENTAL PROTOCOL

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; 2.5 units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 19 and approximately 1<F copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and 60°C for 30
sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing 0.5 µg/ml EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2 20, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence
measurement is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair
HGP2/Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. 21. Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the wild-type globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown by
Wu et al. 21 to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-36
transilluminator (UV-Products, San Gabriel, CA).

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a OG 435 nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 570 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed) effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version 2.5 (SPEX) data system.
Acknowledgments

We thank Bob Jones for help with the spectrofluorometric
measurements and Heatherbell Fong for editing this manuscript.
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with Fluorescent Dyes at 
Opposite Ends Provide a Quenched Probe 
System Useful for Detecting PCR Product 
and Nucleic Acid Hybridization 

Kenneth J. Livak, Susan J.A. Flood, Jeffrey Marmaro, William Giusti, and Karin Deetz
Perkin-Elmer, Applied Biosystems Division, Foster City, California 94404



The 5' nuclease PCR assay detects the
accumulation of specific PCR product
by hybridization and cleavage of a
double-labeled fluorogenic probe
during the amplification reaction.
The probe is an oligonucleotide with
both a reporter fluorescent dye and a
quencher dye attached. An increase
in reporter fluorescence intensity
indicates that the probe has hybridized
to the target PCR product and has
been cleaved by the 5'→3'
nucleolytic activity of Taq DNA polymerase.
In this study, probes with the
quencher dye attached to an internal
nucleotide were compared with
probes with the quencher dye
attached to the 3'-end nucleotide. In all
cases, the reporter dye was attached
to the 5' end. All intact probes
showed quenching of the reporter
fluorescence. In general, probes with
the quencher dye attached to the
3'-end nucleotide exhibited a larger
signal in the 5' nuclease PCR assay than
the internally labeled probes. It is
proposed that the larger signal is
caused by increased likelihood of
cleavage by Taq DNA polymerase
when the probe is hybridized to a
template strand during PCR. Probes
with the quencher dye attached to
the 3'-end nucleotide also exhibited
an increase in reporter fluorescence
intensity when hybridized to a
complementary strand. Thus,
oligonucleotides with reporter and quencher
dyes attached at opposite ends can
be used as homogeneous hybridization



A homogeneous assay for detecting
the accumulation of specific PCR
product that uses a double-labeled
fluorogenic probe was described by Lee et al.(1)
The assay exploits the 5' → 3'
nucleolytic activity of Taq DNA
polymerase(2,3) and is diagrammed in Figure 1.
The fluorogenic probe consists of an
oligonucleotide with a reporter fluorescent
dye, such as a fluorescein, attached to
the 5' end; and a quencher dye, such as a
rhodamine, attached internally. When
the fluorescein is excited by irradiation,
its fluorescent emission will be
quenched if the rhodamine is close
enough to be excited through the
process of fluorescence energy transfer
(FET).(4,5) During PCR, if the probe is
hybridized to a template strand, Taq DNA
polymerase will cleave the probe
because of its inherent 5' → 3' nucleolytic
activity. If the cleavage occurs between
the fluorescein and rhodamine dyes, it
causes an increase in fluorescein
fluorescence intensity because the fluorescein
is no longer quenched. The increase in
fluorescein fluorescence intensity
indicates that the probe-specific PCR product
has been generated. Thus, FET between a
reporter dye and a quencher dye is
critical to the performance of the probe in
the 5' nuclease PCR assay.

Quenching is completely dependent
on the physical proximity of the two
dyes.(6) Because of this, it has been
assumed that the quencher dye must be
attached near the 5' end. Surprisingly,
we have found that attaching a
rhodamine dye at the 3' end of a probe



PCR assay. Furthermore, cleavage of this
type of probe is not required to achieve
some reduction in quenching.
Oligonucleotides with a reporter dye on the 5'
end and a quencher dye on the 3' end
exhibit a much higher reporter
fluorescence when double-stranded as
compared with single-stranded. This should
make it possible to use this type of
double-labeled probe for homogeneous
detection of nucleic acid hybridization.



MATERIALS AND METHODS 
Oligonucleotides 

Table 1 shows the nucleotide sequence
of the oligonucleotides used in this
study. Linker arm nucleotide (LAN)
phosphoramidite was obtained from
Glen Research. The standard DNA
phosphoramidites, 6-carboxyfluorescein
(6-FAM) phosphoramidite,
6-carboxytetramethylrhodamine succinimidyl ester
(TAMRA NHS ester), and Phosphalink
for attaching a 3'-blocking phosphate,
were obtained from Perkin-Elmer,
Applied Biosystems Division.
Oligonucleotide synthesis was performed using an
ABI model 394 DNA synthesizer (Applied
Biosystems). Primer and complement
oligonucleotides were purified using
Oligo Purification Cartridges (Applied
Biosystems). Double-labeled probes were
synthesized with 6-FAM-labeled
phosphoramidite at the 5' end, LAN replacing
one of the Ts in the sequence, and
Phosphalink at the 3' end. Following
deprotection and ethanol precipitation,
FIGURE 1 Diagram of 5' nuclease assay. Stepwise representation of the 5' → 3' nucleolytic
activity of Taq DNA polymerase acting on a fluorogenic probe during one extension phase of PCR.



mM Na-bicarbonate buffer (pH 9.0) at
room temperature. Unreacted dye was
removed by passage over a PD-10
Sephadex column. Finally, the double-labeled
probe was purified by preparative
high-performance liquid chromatography
(HPLC) using an Aquapore C8 220x4.6-mm
column with 7-µm particle size. The
column was developed with a 24-min
linear gradient of 9-20% acetonitrile in
0.1 M TEAA (triethylamine acetate).
Probes are named by designating the
sequence from Table 1 and the position of
the LAN-TAMRA moiety. For example,
probe A1-7 has sequence A1 with
LAN-TAMRA at nucleotide position 7 from the
5' end.



PCR Systems

All PCR amplifications were performed
in the Perkin-Elmer GeneAmp PCR
System 9600 using 50-µl reactions that
contained 10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 200 µM dATP, 200 µM dCTP, 200 µM
dGTP, 400 µM dUTP, 0.5 unit of AmpErase
uracil N-glycosylase (Perkin-Elmer),



gene (nucleotides 2141-2435 in the
sequence of Nakajima-Iijima et al.)(7) was
amplified using primers AFP and ARP
(Table 1), which are modified slightly
from those of du Breuil et al.(8) Actin
amplification reactions contained 4 mM
MgCl2, 20 ng of human genomic DNA,
50 nM A1 or A3 probe, and 300 nM each



TABLE 1 Sequences of Oligonucleotides 



primer. The thermal regimen was 50°C
(2 min), 95°C (10 min), 40 cycles of 95°C
(20 sec), 60°C (1 min), and hold at 72°C.
A 315-bp segment was amplified from a
plasmid that consists of a segment of λ
DNA (nucleotides 32,200-3?,747)
inserted in the SmaI site of vector pUC119.
These reactions contained 3.5 mM
MgCl2, 1 ng of plasmid DNA, 50 nM P2 or
P5 probe, 200 nM primer F119, and 200
nM primer R119. The thermal regimen
was 50°C (2 min), 95°C (10 min), 25
cycles of 95°C (20 sec), 57°C (1 min), and
hold at 72°C.



Fluorescence Detection

For each amplification reaction, a 40-µl
aliquot of a sample was transferred to an
individual well of a white, 96-well
microtiter plate (Perkin-Elmer). Fluorescence
was measured on the Perkin-Elmer
TaqMan LS-50B System, which consists of a
luminescence spectrometer with plate
reader assembly, a 485-nm excitation
filter, and a 515-nm emission filter.
Excitation was at 488 nm using a 5-nm slit
width. Emission was measured at 518
nm for 6-FAM (the reporter or R value)
and 582 nm for TAMRA (the quencher or
Q value) using a 10-nm slit width. To
determine the increase in reporter
emission that is caused by cleavage of the
probe during PCR, three normalizations
are applied to the raw emission data.
First, emission intensity of a buffer blank
is subtracted for each wavelength.
Second, emission intensity of the reporter is



Name   Type         Sequence
F119   primer       ACCCACAGGAACTGATCACCACTC
R119   primer       ATGTCGCGTTCCGGCTGACGTTCTGC
P2     probe        TCGCATTACTGATCGTTGCCAACCAGTp
P2C    complement   GTACTGGTTGGCAACGATCAGTAATGCGATG
P5     probe        CGGATTTGCTGGTATCTATGACAAGGATp
P5C    complement   TTCATCCTTGTCATAGATACCAGCAAATCCG
AFP    primer       TCACCCACACTGTGCCCATCTACGA
ARP    primer       CAGCGGAACCGCTCATTGCCAATGG
A1     probe        ATGCCCTCCCCCATGCCATCCTGCGTp
A1C    complement   AGACGCAGGATGGCATGGGGGAGGGCATAC
A3     probe        CGCCCTGGACTTCGAGCAAGAGATp
A3C    complement   CCATCTCTTGCTCGAAGTCCAGGGCGAC

For each oligonucleotide used in this study, the nucleic acid sequence is given, written in the
5' → 3' direction. There are three types of oligonucleotides: PCR primer, fluorogenic probe used
[Figure 2: fluorescence results for probes A1-2, A1-7, A1-14, A1-19, A1-22, and A1-26, with
columns for emission at 518 nm (no temp., + temp.), emission at 582 nm (no temp., + temp.),
RQ-, RQ+, and ΔRQ.]


FIGURE 2 Results of 5' nuclease assay comparing β-actin probes with TAMRA at different
nucleotide positions. As described in Materials and Methods, PCR amplifications containing the
indicated probes were performed, and the fluorescence emission was measured at 518 and 582 nm.
Reported values are the average ± 1 s.d. for six reactions run without added template (no temp.)
and six reactions run with template (+ temp.). The RQ ratio was calculated for each individual
reaction and averaged to give the reported RQ- and RQ+ values.



divided by the emission intensity of the
quencher to give an RQ ratio for each
reaction tube. This normalizes for
well-to-well variations in probe
concentration and fluorescence measurement.
Finally, ΔRQ is calculated by subtracting
the RQ value of the no-template control
(RQ-) from the RQ value for the
complete reaction including template
(RQ+).
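
The three normalizations can be written out as a short Python sketch. The function names `rq` and `delta_rq`, the tuple layout of the readings, and the numbers in the usage lines are hypothetical and chosen only for illustration; following the Figure 2 legend, RQ is computed per reaction and then averaged.

```python
from statistics import mean


def rq(reporter_518, quencher_582, blank_518=0.0, blank_582=0.0):
    """Blank-subtract both wavelengths, then divide reporter by quencher emission."""
    return (reporter_518 - blank_518) / (quencher_582 - blank_582)


def delta_rq(with_template, no_template, blank=(0.0, 0.0)):
    """Return (RQ-, RQ+, delta RQ) from lists of (518 nm, 582 nm) raw readings.

    RQ is calculated for each individual reaction and then averaged;
    delta RQ is the averaged RQ+ minus the averaged RQ-.
    """
    rq_plus = mean([rq(r, q, *blank) for r, q in with_template])
    rq_minus = mean([rq(r, q, *blank) for r, q in no_template])
    return rq_minus, rq_plus, rq_plus - rq_minus


# Invented readings: two reactions with template, two without, no buffer blank.
rq_minus, rq_plus, d_rq = delta_rq(
    with_template=[(300.0, 100.0), (320.0, 100.0)],
    no_template=[(50.0, 100.0), (50.0, 100.0)],
)
```

Dividing by the 582-nm reading before averaging is what cancels well-to-well differences in probe concentration, since both emissions scale with the amount of probe in the well.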

RESULTS 

A series of probes with increasing
distances between the fluorescein reporter
and rhodamine quencher were tested to
investigate the minimum and maximum
spacing that would give an acceptable
performance in the 5' nuclease PCR
assay. These probes hybridize to a target
sequence in the human β-actin gene.
Figure 2 shows the results of an
experiment in which these probes were
included in PCR that amplified a segment
of the β-actin gene containing the target
sequence. Performance in the 5'
nuclease PCR assay is monitored by the
magnitude of ΔRQ, which is a measure
of the increase in reporter fluorescence
caused by PCR amplification of the
probe target. Probe A1-2 has a ΔRQ value
that is close to zero, indicating that the
probe was not cleaved appreciably
during the amplification reaction. This
suggests that with the quencher dye on the
second nucleotide from the 5' end, there
is insufficient room for Taq polymerase
to cleave efficiently between the reporter
and quencher. The other five probes
exhibited comparable ΔRQ values that are
clearly different from zero. Thus, all five
probes are being cleaved during PCR
amplification resulting in a similar increase
in reporter fluorescence. It should be
noted that complete digestion of a probe
produces a much larger increase in
reporter fluorescence than that observed
in Figure 2 (data not shown). Thus, even
in reactions where amplification occurs,
the majority of probe molecules remain
uncleaved. It is mainly for this reason
that the fluorescence intensity of the
quencher dye TAMRA changes little with
amplification of the target. This is what
allows us to use the 582-nm fluorescence
reading as a normalization factor.

The magnitude of RQ- depends
mainly on the quenching efficiency
inherent in the specific structure of the
probe and the purity of the
oligonucleotide. Thus, the larger RQ- values
indicate that probes A1-14, A1-19, A1-22, and
A1-26 probably have reduced quenching
as compared with A1-7. Still, the degree
of quenching is sufficient to detect a
highly significant increase in reporter
fluorescence when each of these probes
is cleaved during PCR.

To further investigate the ability of
TAMRA on the 3' end to quench 6-FAM
on the 5' end, three additional pairs of
probes were tested in the 5' nuclease
PCR assay. For each pair, one probe has
TAMRA attached to an internal
nucleotide and the other has TAMRA attached
to the 3' end nucleotide. The results are
shown in Table 2. For all three sets, the
probe with the 3' quencher exhibits a
ΔRQ value that is considerably higher
than for the probe with the internal
quencher. The RQ- values suggest that
differences in quenching are not as great
as those observed with some of the A1
probes. These results demonstrate that a
quencher dye on the 3' end of an
oligonucleotide can quench efficiently the
TABLE 2 Results of 5' Nuclease Assay Comparing Probes with TAMRA Attached to an Internal or 3'-terminal Nucleotide

         518 nm                        582 nm
Probe    no temp.       + temp.        no temp.       + temp.        RQ-           RQ+           ΔRQ
A3-6     54.6 ± 3.2     84.6 ± 3.7     116.2 ± 6.4    ...            0.47 ± 0.02   0.73 ± 0.03   0.26 ± 0.04
A3-24    71.1 ± 2.9     236.5 ± 11.1   ... ± 4.0      90.2 ± 3.8     0.86 ± 0.02   2.62 ± 0.05   1.76 ± 0.05
P2-7     ... ± 4.4      384.0 ± 34.1   ... ± 6.4      120.4 ± 10.2   0.79 ± 0.02   3.19 ± 0.16   2.40 ± 0.16
P2-27    113.4 ± 6.6    555.4 ± 14.1   140.7 ± 8.5    118.7 ± 4.8    0.81 ± 0.01   4.68 ± 0.10   3.88 ± 0.10
P5-10    77.5 ± 6.5     244.4 ± 15.9   86.7 ± 4.3     95.8 ± 6.7     0.89 ± 0.05   2.55 ± 0.06   1.66 ± 0.08
P5-28    64.0 ± 5.2     333.6 ± 12.1   101.6 ± 6.1    94.7 ± 6.3     0.63 ± 0.02   3.53 ± 0.12   2.89 ± 0.13

Reactions and calculations were performed as described in Materials and Methods and in the legend to Fig. 2.





fluorescence of a reporter dye on the 5' end. The degree of quenching is
sufficient for this type of oligonucleotide to be used as a probe in the
5' nuclease PCR assay.
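
The quantities compared in Table 2 can be sketched in a few lines of Python. This is only an illustration, not the authors' software: it assumes RQ is the ratio of 518-nm (reporter) to 582-nm (quencher) emission and that ΔRQ is RQ⁺ minus RQ⁻, which is consistent with the legible table entries (for example 2.62 − 0.86 = 1.76). The function names and the example readings are hypothetical.

```python
def rq(reporter_518, quencher_582):
    """Ratio of reporter (518 nm) to quencher (582 nm) emission intensity."""
    if quencher_582 <= 0:
        raise ValueError("quencher emission must be positive")
    return reporter_518 / quencher_582


def delta_rq(no_template, plus_template):
    """Return (RQ-, RQ+, dRQ); each argument is a (518 nm, 582 nm) pair."""
    rq_minus = rq(*no_template)    # reaction without template: probe intact
    rq_plus = rq(*plus_template)   # reaction with template: probe cleaved
    return rq_minus, rq_plus, rq_plus - rq_minus


# Hypothetical readings, chosen only to make the arithmetic easy to follow.
rq_minus, rq_plus, d = delta_rq(no_template=(50.0, 100.0),
                                plus_template=(240.0, 96.0))
print(f"RQ- = {rq_minus:.2f}, RQ+ = {rq_plus:.2f}, dRQ = {d:.2f}")
```

A larger ΔRQ means a larger template-dependent rise in normalized reporter signal, which is the basis on which the 3' and internally labeled probes are compared.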

To test the hypothesis that quenching by a 3' TAMRA depends on the
flexibility of the oligonucleotide, fluorescence was measured for probes
in the single-stranded and double-stranded states. Table 3 reports the
fluorescence observed at 518 and 582 nm. The relative degree of quenching
is assessed by calculating the RQ ratio. For probes with TAMRA 6-10
nucleotides from the 5' end, there is little difference in the RQ values
when comparing single-stranded with double-stranded oligonucleotides. The
results for probes with TAMRA at the 3' end are much different. For these
probes, hybridization to a complementary strand causes a dramatic
increase in RQ. We propose that this loss of quenching is caused by the
rigid structure of double-stranded DNA, which prevents the 5' and 3' ends
from being in proximity.
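
As a rough check of this comparison, the sketch below recomputes RQ for one 3'-labeled probe (A1-26) and one internally labeled probe (A3-6) from the emission values as read from Table 3, and reports the fold change on hybridization. The helper name `fold_change_on_hybridization` is an assumption made for illustration.

```python
def fold_change_on_hybridization(ss, ds):
    """ss and ds are (518 nm, 582 nm) emission pairs; returns RQ(ds) / RQ(ss)."""
    rq_ss = ss[0] / ss[1]
    rq_ds = ds[0] / ds[1]
    return rq_ds / rq_ss


# Emission values as read from Table 3 (single-stranded, double-stranded).
a1_26 = fold_change_on_hybridization(ss=(43.31, 53.50), ds=(509.38, 93.86))
a3_6 = fold_change_on_hybridization(ss=(16.75, 39.33), ds=(62.88, 165.57))

print(f"A1-26 (3' TAMRA): RQ changes {a1_26:.1f}-fold on hybridization")
print(f"A3-6 (internal TAMRA): RQ changes {a3_6:.1f}-fold on hybridization")
```

The 3'-labeled probe shows a several-fold rise in RQ when double-stranded, while the internally labeled probe stays close to 1, matching the description above.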

When TAMRA is placed toward the 3' end, there is a marked Mg²⁺ effect on
quenching. Figure 3 shows a plot of observed RQ values for the A1 series
of probes as a function of Mg²⁺ concentration. With TAMRA attached near
the 5' end (probe A1-2 or A1-7), the RQ value at 0 mM Mg²⁺ is only
slightly higher than RQ at 10 mM Mg²⁺. For probes A1-19, A1-22, and
A1-26, the RQ values at 0 mM Mg²⁺ are very high, indicating a much
reduced quenching efficiency. For each of these probes, there is a marked
decrease in RQ at 1 mM Mg²⁺ followed by a gradual decline as the Mg²⁺
concentration increases to 10 mM. Probe A1-14 shows an intermediate RQ
value at 0 mM Mg²⁺ with a gradual decline at higher Mg²⁺ concentrations.
In a low-salt environment with no Mg²⁺ present, a single-stranded
oligonucleotide would be expected to adopt an extended conformation
because of electrostatic repulsion. The binding of Mg²⁺ ions acts to
shield the negative charge of the phosphate backbone so that the
oligonucleotide can adopt conformations where the 3' end is close to the
5' end. Therefore, the observed Mg²⁺ effects support the notion that
quenching of a 5' reporter dye by TAMRA at or near the 3' end depends on
the flexibility of the oligonucleotide.

DISCUSSION 

The striking finding of this study is that it seems the rhodamine dye
TAMRA, placed at any position in an oligonucleotide, can quench the
fluorescent emission of a fluorescein (6-FAM) placed at the 5' end. This
implies that a single-stranded, double-labeled oligonucleotide must be
able to adopt conformations where the TAMRA is close to the 5' end. It
should be noted that the decay of 6-FAM in the excited state requires a
certain amount of time. Therefore, what matters for quenching is not the
average distance between 6-FAM and TAMRA but, rather, how close TAMRA can
get to 6-FAM during the lifetime of the 6-FAM excited state. As long as
the decay time of the excited state is relatively long compared with the
molecular motions of the oligonucleotide, quenching can occur. Thus, we
propose that TAMRA at the 3' end, or any other position, can quench 6-FAM
at the 5' end because TAMRA is in proximity to 6-FAM often enough to be
able to accept energy transfer from an excited 6-FAM.

Details of the fluorescence measurements remain puzzling. For example,
Table 3 shows that hybridization of probes A1-26, A3-24, and [probe name
missing in source] to their complementary strands not only causes a large
increase in 6-FAM fluorescence at 518 nm but also causes a modest
increase in TAMRA fluorescence at 582 nm. If TAMRA is being excited by
energy transfer from quenched 6-FAM, then loss of quenching attributable
to hybridization should cause a decrease in the fluorescence emission of
TAMRA. The fact that the fluorescence emission of TAMRA increases
indicates that the situation is more complex. For example, we have
anecdotal evidence that the bases of the oligonucleotide, especially G,
quench the fluorescence of both 6-FAM and TAMRA to some degree. When
double-stranded, base-pairing may reduce the ability of the bases to
quench. The primary factor causing the quenching of 6-FAM in an intact
probe is the TAMRA dye. Evidence for the importance of TAMRA is that
6-FAM fluorescence remains relatively unchanged when probes labeled only
with 6-FAM are used in the 5' nuclease PCR assay (data not shown).
Secondary effectors of fluorescence, both before and after cleavage of
the probe, need to be explored further.

Regardless of the physical mechanism, the relative independence of
position and quenching greatly simplifies the design of probes for the 5'
nuclease PCR assay. There are three main factors that determine the
performance of a double-labeled fluorescent probe in the 5' nuclease PCR
assay. The first factor is the degree of quenching observed in the intact
probe. This is characterized by the value of RQ⁻, which is the ratio of
reporter to quencher fluorescent emissions



TABLE 3 Comparison of Fluorescence Emissions of Single-stranded and Double-stranded Fluorogenic Probes

          518 nm                582 nm                RQ
Probe     ss        ds          ss        ds          ss        ds
A1-7      27.75     ?           61.08     138.18      0.45      ?.50
A1-26     43.31     509.38      53.50     93.86       0.81      5.43
A3-6      16.75     62.88       39.33     165.57      0.43      0.38
A3-24     ?         578.64      67.7?     1?0.25      0.4?      3.21
P2-7      35.02     70.13       ?         123.?9      0.64      0.58
P2-27     39.?9     320.47      ?         ?           0.61      ?.25
P5-10     27.??     144.85      ?         165.54      0.44      0.87
P5-28     33.65     4?2.29      ?         104.61      0.46      4.43

(ss) Single-stranded. The fluorescence emissions at 518 or 582 nm for solutions containing a final
concentration of 50 nM indicated probe, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 10 mM MgCl2.
(ds) Double-stranded. The solutions contained, in addition, 100 nM A1C for probes A1-7 and
A1-26, 100 nM A3C for probes A3-6 and A3-24, 100 nM P2C for probes P2-7 and P2-27, or 100 nM
P5C for probes P5-10 and P5-28. Before the addition of MgCl2, 120 [unit illegible] of each sample was heated
(? = digit or value illegible in the scanned source.)
dyes used, spacing between reporter and quencher dyes, nucleotide
sequence context effects, presence of structure or other factors that
reduce flexibility of the oligonucleotide, and purity of the probe. The
second factor is the efficiency of hybridization, which depends on probe
Tm, presence of secondary structure in probe or template, annealing
temperature, and other reaction conditions. The third factor is the
efficiency at which Taq DNA polymerase cleaves the bound probe between
the reporter and quencher dyes. This cleavage is dependent on sequence
complementarity between probe and template as shown by the observation
that mismatches in the segment between reporter and quencher dyes
drastically reduce the cleavage of probe.

The rise in RQ⁻ values for the A1 series of probes seems to indicate that
the degree of quenching is reduced somewhat as the quencher is placed
toward the 3' end. The lowest apparent quenching is observed for probe
A1-19 (see Fig. 3) rather than for the probe where the TAMRA is at the 3'
end (A1-26). This is understandable, as the conformation of the 3' end
position would be expected to be less restricted than the conformation of
an internal position. In effect, a quencher at the 3' end is freer to
adopt conformations close to the 5' reporter dye than is an internally
placed quencher. For the other three sets of probes, the interpretation
of RQ⁻ values is less clear-cut. The A3 probes show the same trend as A1,
with the 3' TAMRA probe having a larger RQ⁻ than the internal TAMRA
probe. For the P2 pair, both probes have about the same RQ⁻ value. For
the P5 probes, the RQ⁻ for the 3' probe is less than for the internally
labeled probe. Another factor that may explain some of the observed
variation is that purity affects the RQ⁻ value. Although all probes are
HPLC purified, a small amount of contamination with unquenched reporter
can have a large effect on RQ⁻.

Although there may be a modest effect on degree of quenching, the
position of the quencher apparently can have a large effect on the
efficiency of probe cleavage. The most drastic effect is observed with
probe A1-2, where placement of the TAMRA on the second nucleotide reduces
the efficiency of cleavage to almost zero. For the A3, P2, and P5 probes,
ΔRQ is much greater for the 3' TAMRA probes as compared with the internal
TAMRA probes. This is explained most easily by assuming that probes with
TAMRA at the 3' end are more likely to be cleaved between reporter and
quencher than are probes with TAMRA attached internally. For the A1
probes, the cleavage efficiency of probe A1-7 must already be quite high,
as ΔRQ does not increase when the quencher is placed closer to the 3'
end. This illustrates the importance of being able to use probes with a
quencher on the 3' end in the 5' nuclease PCR assay. In this assay, an
increase in the intensity of reporter fluorescence is observed only when
the probe is cleaved between the reporter and quencher dyes. By placing
the reporter and quencher dyes on the opposite ends of an oligonucleotide
probe, any cleavage that occurs will be detected. When the quencher is
attached to an internal nucleotide, sometimes the probe works well (A1-7)
and other times not so well (A3-6). The relatively poor performance of
probe A3-6 presumably means the probe is being cleaved 3' to the quencher
rather than between the reporter and quencher. Therefore, the best chance
of having a probe that reliably detects accumulation of PCR product in
the 5' nuclease PCR assay is to use a probe with the reporter and
quencher dyes on opposite ends.

Placing the quencher dye on the 3' end may also provide a slight benefit
in terms of hybridization efficiency. The presence of a quencher attached
to an internal nucleotide might be expected to disrupt base-pairing and
reduce the Tm of a probe. In fact, a 2°C-3°C reduction in Tm has been
observed for two probes with internally attached TAMRAs.(9) This
disruptive effect would be minimized by placing the quencher at the 3'
end. Thus, probes with 3' quenchers might exhibit slightly higher
hybridization efficiencies than probes with internal quenchers.

The combination of increased cleavage and hybridization efficiencies
means that probes with 3' quenchers probably will be more tolerant of
mismatches between probe and target as compared with internally labeled
probes. This tolerance of mismatches can be advantageous, as when trying
to use a single probe to detect PCR-amplified products from samples of
different species. Also, it means that cleavage of probe during PCR is
less sensitive to alterations in annealing temperature or other reaction
conditions. The one application where tolerance of mismatches may be a
disadvantage is for allelic discrimination. Lee et al. demonstrated that
allele-specific probes were cleaved between reporter and quencher only
when hybridized to a perfectly complementary target. This allowed them to
distinguish the normal human cystic fibrosis allele from the ΔF508
mutant. Their probes had TAMRA attached to the seventh nucleotide from
the 5' end and were designed so that any mismatches were between the
reporter and quencher. Increasing the distance between reporter and
quencher would lessen the disruptive effect of mismatches and allow
cleavage of the probe on the incorrect target. Thus, probes with a
quencher attached to an internal nucleotide may still be useful for
allelic discrimination.

In this study loss of quenching upon hybridization was used to show that
quenching by a 3' TAMRA is dependent on the flexibility of a
single-stranded oligonucleotide. The increase in reporter fluorescence
intensity, though, could also be used to determine whether hybridization
has occurred or not. Thus, oligonucleotides with reporter and quencher
dyes attached at opposite ends should also be useful as hybridization
probes. The ability to detect hybridization in real time means that these
probes could be used to measure hybridization kinetics. Also, this type
of probe could be used to develop homogeneous hybridization assays for
diagnostics or other applications. Bagwell et al.(10) describe just this
type of homogeneous assay where hybridization of a probe causes an
increase in fluorescence caused by a loss of quenching. However, they
utilized a complex probe design that requires adding nucleotides to both
ends of the probe sequence to form two imperfect hairpins. The results
presented here demonstrate that the simple addition of a reporter dye to
one end of an oligonucleotide and a quencher dye to the other end
generates a fluorogenic probe that can detect hybridization or PCR
amplification.
We have developed a novel "real time" quantitative PCR method. The method measures PCR product
accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan Probe). This method provides very
accurate and reproducible quantitation of gene copies. Unlike other quantitative PCR methods, real-time PCR
does not require post-PCR sample handling, preventing potential PCR product carry-over contamination and
resulting in much faster and higher throughput assays. The real-time PCR method has a very large dynamic
range of starting target molecule determination (at least five orders of magnitude). Real-time quantitative
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods.



Quantitative nucleic acid sequence analysis has had an important role in
many fields of biological research. Measurement of gene expression (RNA)
has been used extensively in monitoring biological responses to various
stimuli (Tan et al. 1994; Huang et al. 1995a,b; Prud'homme et al. 1995).
Quantitative gene analysis (DNA) has been used to determine the genome
quantity of a particular gene, as in the case of the human HER2 gene,
which is amplified in ~30% of breast tumors (Slamon et al. 1987). Gene
and genome quantitation (DNA and RNA) also have been used for analysis of
human immunodeficiency virus (HIV) burden demonstrating changes in the
levels of virus throughout the different phases of the disease (Connor et
al. 1993; Piatak et al. 1993b; Furtado et al. 1995).

Many methods have been described for the quantitative analysis of nucleic
acid sequences (both for RNA and DNA; Southern 1975; Sharp et al. 1980;
Thomas 1980). Recently, PCR has proven to be a powerful tool for
quantitative nucleic acid analysis. PCR and reverse transcriptase
(RT)-PCR have permitted the analysis of minimal starting quantities of
nucleic acid (as little as one cell equivalent). This has made possible
many experiments that could not have been performed with traditional
methods. Although PCR has provided a powerful tool, it is imperative that
it be used properly for quantitation (Raeymaekers 1995). Many early
reports of quantitative PCR and RT-PCR described quantitation of the PCR
product but did not measure the initial target sequence quantity. It is
essential to design proper controls for the quantitation of the initial
target sequences (Ferre 1992; Clementi et al. 1993).

Researchers have developed several methods of quantitative PCR and
RT-PCR. One approach measures PCR product quantity in the log phase of
the reaction before the plateau (Kellogg et al. 1990; Pang et al. 1990).
This method requires that each sample has equal input amounts of nucleic
acid and that each sample under analysis amplifies with identical
efficiency up to the point of quantitative analysis. A gene sequence
(contained in all samples at relatively constant quantity, such as
β-actin) can be used for sample amplification efficiency normalization.
Using conventional methods of PCR detection and quantitation (gel
electrophoresis or plate capture hybridization), it is extremely
laborious to assure that all samples are analyzed during the log phase of
the reaction (for both the target gene and the normalization gene).
Another method, quantitative competitive (QC)-PCR, has been developed and
is used widely for PCR quantitation. QC-PCR relies on the inclusion of an
internal control competitor in each reaction (Becker-Andre 1991; Piatak
et al. 1993a,b). The efficiency of each reaction is normalized to the
internal competitor. A known amount of internal competitor can be added
to each sample. To obtain relative quantitation, the unknown target PCR
product is compared with the known competitor PCR product. Success of a
quantitative competitive PCR assay relies on developing an internal
control that amplifies with the same efficiency as the target molecule.
The design of the competitor and the validation of amplification
efficiencies require a dedicated effort. However, because QC-PCR does not
require that PCR products be analyzed during the log phase of the
amplification, it is the easier of the two methods to use.
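
The arithmetic behind competitive quantitation can be sketched as follows. This is a minimal illustration under the assumption stated in the text, that target and competitor amplify with the same efficiency, so the ratio of their products equals the ratio of their starting amounts. The function name and the numbers are hypothetical.

```python
def qc_pcr_initial_target(competitor_input, target_product, competitor_product):
    """Estimate the starting target amount from a competitive PCR.

    Assumes equal amplification efficiency for target and competitor, so
    target_input / competitor_input == target_product / competitor_product.
    """
    if competitor_product <= 0:
        raise ValueError("competitor product signal must be positive")
    return competitor_input * target_product / competitor_product


# Hypothetical example: 1000 competitor copies were spiked in, and the target
# product signal is twice the competitor product signal.
estimate = qc_pcr_initial_target(competitor_input=1000,
                                 target_product=300.0,
                                 competitor_product=150.0)
print(f"estimated starting target: {estimate:.0f} copies")
```

If the two templates do not amplify with equal efficiency, the product ratio drifts away from the input ratio with every cycle, which is why the text stresses validating the competitor.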

Several detection systems are used for quantitative PCR and RT-PCR
analysis: (1) agarose gels, (2) fluorescent labeling of PCR products and
detection with laser-induced fluorescence using capillary electrophoresis
(Fasco et al. 1995; Williams et al. 1996) or acrylamide gels, and (3)
plate capture and sandwich probe hybridization (Mulder et al. 1994).
Although these methods proved successful, each method requires post-PCR
manipulations that add time to the analysis and may lead to laboratory
contamination. The sample throughput of these methods is limited (with
the exception of the plate capture approach), and, therefore, these
methods are not well suited for uses demanding high sample throughput
(i.e., screening of large numbers of biomolecules or analyzing samples
for diagnostics or clinical trials).

Here we report the development of a novel assay for quantitative DNA
analysis. The assay is based on the use of the 5' nuclease assay first
described by Holland et al. (1991). The method uses the 5' nuclease
activity of Taq polymerase to cleave a nonextendible hybridization probe
during the extension phase of PCR. The approach uses dual-labeled
fluorogenic hybridization probes (Lee et al. 1993; Bassler et al. 1995;
Livak et al. 1995a,b). One fluorescent dye serves as a reporter [FAM
(i.e., 6-carboxyfluorescein)] and its emission spectra is quenched by the
second fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethylrhodamine). The
nuclease degradation of the hybridization probe releases the quenching of
the FAM fluorescent emission, resulting in an increase in peak
fluorescent emission at 518 nm. The use of a sequence detector (ABI
Prism) allows measurement of fluorescent spectra of all 96 wells of the
thermal cycler continuously during the amplification. Therefore, the
reactions are monitored in real time. The output data is described and
quantitative analysis of input target DNA sequences is discussed below.
RESULTS

PCR Product Detection in Real Time

The goal was to develop a high-throughput, sensitive, and accurate gene
quantitation assay for use in monitoring lipid-mediated therapeutic gene
delivery. A plasmid encoding human factor VIII gene sequence, pF8TM (see
Methods), was used as a model therapeutic gene. The assay uses
fluorescent Taqman methodology and an instrument capable of measuring
fluorescence in real time (ABI Prism 7700 Sequence Detector). The Taqman
reaction requires a hybridization probe labeled with two different
fluorescent dyes. One dye is a reporter dye (FAM), the other is a
quenching dye (TAMRA). When the probe is intact, fluorescent energy
transfer occurs and the reporter dye fluorescent emission is absorbed by
the quenching dye (TAMRA). During the extension phase of the PCR cycle,
the fluorescent hybridization probe is cleaved by the 5'-3' nucleolytic
activity of the DNA polymerase. On cleavage of the probe, the reporter
dye emission is no longer transferred efficiently to the quenching dye,
resulting in an increase of the reporter dye fluorescent emission
spectra. PCR primers and probes were designed for the human factor VIII
sequence and human β-actin gene (as described in Methods). Optimization
reactions were performed to choose the appropriate probe and magnesium
concentrations yielding the highest intensity of reporter fluorescent
signal without sacrificing specificity. The instrument uses a
charge-coupled device (i.e., CCD camera) for measuring the fluorescent
emission spectra from 500 to 650 nm. Each PCR tube was monitored
sequentially for 25 msec with continuous monitoring throughout the
amplification. Each tube was re-examined every 8.5 sec. Computer software
was designed to examine the fluorescent intensity of both the reporter
dye (FAM) and the quenching dye (TAMRA). The fluorescent intensity of the
quenching dye, TAMRA, changes very little over the course of the PCR
amplification (data not shown). Therefore, the intensity of TAMRA dye
emission serves as an internal standard with which to normalize the
reporter dye (FAM) emission variations. The software calculates a value
termed ΔRn (or ΔRQ) using the following equation: ΔRn = (Rn⁺) − (Rn⁻),
where Rn⁺ = emission intensity of reporter/emission intensity of quencher
at any given time in a reaction tube, and Rn⁻ = emission intensity of
reporter/emission intensity of quencher measured prior to PCR
amplification in that same reaction tube. For the purpose of
quantitation, the last three data points (ΔRns) collected during the
extension step for each PCR cycle were analyzed. The nucleolytic
degradation of the hybridization probe occurs during the extension phase
of PCR, and, therefore, reporter fluorescent emission increases during
this time. The three data points were averaged for each PCR cycle and the
mean value for each was plotted in an "amplification plot" shown in
Figure 1A. The ΔRn mean value is plotted on the y-axis, and time,
represented by cycle number, is plotted on the x-axis. During the early
cycles of the PCR amplification, the ΔRn



value remains at base line. When sufficient hybridization probe has been
cleaved by the Taq polymerase nuclease activity, the intensity of
reporter fluorescent emission increases. Most PCR amplifications reach a
plateau phase of reporter fluorescent emission if the reaction is carried
out to high cycle numbers. The amplification plot is examined early in
the reaction, at a point that represents the log phase of product
accumulation. This is done by assigning an arbitrary threshold that is
based on the variability of the base-line data. In Figure 1A, the
threshold was set at 10 standard deviations above the mean of base line
emission calculated from cycles 1 to 15. Once the threshold is chosen,
the point at which
Figure 1 PCR product detection in real time. (A) The Model 7700 software will construct amplification plots
from the extension phase fluorescent emission data collected during the PCR amplification. The standard
deviation is determined from the data points collected from the base line of the amplification plot. CT values are
calculated by determining the point at which the fluorescence exceeds a threshold limit (usually 10 times the
standard deviation of the base line). (B) Overlay of amplification plots of serially (1:2) diluted human genomic
DNA samples amplified with β-actin primers. (C) Input DNA concentration of the samples [remainder of legend illegible]
the amplification plot crosses the threshold is defined as CT. CT is
reported as the cycle number at this point. As will be demonstrated, the
CT value is predictive of the quantity of input target.
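
The calculation described above can be sketched in Python. This is an illustration under stated assumptions, not the instrument's software: ΔRn is taken as the reporter/quencher ratio minus its pre-amplification value, the threshold is the baseline mean plus 10 standard deviations over cycles 1 to 15, and the crossing point is located by linear interpolation between cycles (the interpolation is an assumption; the text only says CT is the cycle number at the crossing point). All function names and the synthetic curve are hypothetical.

```python
from statistics import mean, stdev


def delta_rn(reporter, quencher, reporter_pre, quencher_pre):
    """dRn = Rn+ - Rn-: normalized reporter signal minus its pre-PCR value."""
    return reporter / quencher - reporter_pre / quencher_pre


def baseline_threshold(curve, baseline_cycles=15, n_sd=10):
    """Baseline mean plus n_sd standard deviations over the first cycles."""
    base = curve[:baseline_cycles]
    return mean(base) + n_sd * stdev(base)


def threshold_cycle(curve, threshold):
    """Fractional cycle at which the curve first crosses the threshold.

    curve[0] is cycle 1. Returns None if the threshold is never reached.
    """
    for i in range(1, len(curve)):
        lo, hi = curve[i - 1], curve[i]
        if lo < threshold <= hi:
            # lo belongs to cycle i, hi to cycle i + 1
            return i + (threshold - lo) / (hi - lo)
    return None


# Synthetic per-cycle mean dRn values: 15 noisy baseline cycles, then doubling.
curve = [0.0, 0.01] * 7 + [0.0] + [0.02 * 2 ** k for k in range(10)]
thr = baseline_threshold(curve)
ct = threshold_cycle(curve, thr)
print(f"threshold = {thr:.4f}, CT = {ct:.2f}")
```

Because the threshold sits in the early exponential region, CT is determined well before the plateau, which is the property the text relies on.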

CT Values Provide a Quantitative Measurement of Input Target Sequences

Figure 1B shows amplification plots of [number illegible] PCR
amplifications overlaid. The amplifications were performed on a 1:2
serial dilution of human genomic DNA. The amplified target was human
β-actin. The amplification plots shift to the right (to higher threshold
cycles) as the input target quantity is reduced. This is expected because
reactions with fewer starting copies of the target molecule require
greater amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above the
base line was used to determine the CT values. Figure 1C represents the
CT values plotted versus the sample dilution value. Each dilution was
amplified in triplicate PCR amplifications and plotted as mean values
with error bars representing one standard deviation. The CT values
decrease linearly with increasing target quantity. Thus, CT values can be
used as a quantitative measurement of the input target number. It should
be noted that the amplification plot for the 15.6-ng sample shown in
Figure 1B does not reflect the same fluorescent rate of increase
exhibited by most of the other samples. The 15.6-ng sample also achieves
endpoint plateau at a lower fluorescent value than would be expected
based on the input DNA. This phenomenon has been observed occasionally
with other samples (data not shown) and may be attributable to late cycle
inhibition; this hypothesis is still under investigation. It is important
to note that the flattened slope and early plateau do not impact
significantly the calculated CT value as demonstrated by the fit on the
line shown in Figure 1C. All triplicate amplifications resulted in very
similar CT values; the standard deviation did not exceed 0.5 for any
dilution. This experiment contains a >100,000-fold range of input target
molecules. Using CT values for quantitation permits a much larger assay
range than directly using total fluorescent emission intensity for
quantitation. The linear range of fluorescent intensity measurement of
the ABI Prism 7700 Se-
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meiits over n very large r;mj»<» nf rflativo siarMnp, 
target quantities. 
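
The threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Model 7700 software's algorithm: the function name `threshold_cycle`, the base-line window (cycles 3 to 15), and the linear interpolation between cycles are all hypothetical choices. Only the "10 standard deviations above the base line" threshold comes from the text.

```python
from statistics import mean, stdev


def threshold_cycle(fluorescence, baseline=(3, 15), k=10.0):
    """Estimate C_T for one amplification plot.

    fluorescence: one reading per cycle (index 0 is cycle 1).
    baseline:     first and last cycle (inclusive) treated as base line;
                  this window is an assumed default, not taken from the text.
    k:            threshold multiplier (the text uses 10 standard deviations).
    Returns a fractional cycle number, or None if the threshold is never crossed.
    """
    first, last = baseline
    base = fluorescence[first - 1:last]
    base_mean = mean(base)
    threshold = k * stdev(base)
    # Work with base-line-subtracted fluorescence.
    delta = [f - base_mean for f in fluorescence]
    # Search only after the base-line window; delta[i - 1] is cycle i.
    for i in range(last, len(delta)):
        below, above = delta[i - 1], delta[i]
        if below < threshold <= above:
            # Linear interpolation between cycle i and cycle i + 1.
            return i + (threshold - below) / (above - below)
    return None
```

With idealized doubling per cycle, halving the input target shifts the value returned by this sketch by about one cycle, which is the behavior the 1:2 dilution series in Figure 1B illustrates.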

Sample Preparation Validation

Several parameters influence the efficiency of PCR
amplification: magnesium and salt concentrations, reaction
conditions (i.e., time and temperature), PCR target size and
composition, primer sequences, and sample purity. All of the
above factors are common to a single PCR assay, except sample
to sample purity. In an effort to validate the method of sample
preparation for the factor VIII assay, PCR amplification
reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from
the 10 replicate samples, the DNA was quantitated by ultraviolet
spectroscopy. Amplifications were performed analyzing β-actin
gene content in 100 and 25 ng of total genomic DNA. Each PCR
amplification was performed in triplicate. Comparison of C_T
values for each triplicate sample shows minimal variation based
on standard deviation and coefficient of variance (Table 1).
Therefore, each of the triplicate PCR amplifications was highly
reproducible, demonstrating that real time PCR using this
instrumentation introduces minimal variation into the
quantitative PCR analysis. Comparison of the mean C_T values of
the 10 replicate sample preparations also showed minimal
variability, indicating that each sample preparation yielded
similar results for β-actin gene quantity. The highest C_T
difference between any of the samples was 0.55 and 0.71 for the
100 and 25 ng samples, respectively. Additionally, the
amplification of each sample exhibited an equivalent rate of
fluorescent emission intensity change per amount of DNA target
analyzed, as indicated by similar slopes derived from the sample
dilutions (Fig. 2). Any sample containing an excess of a PCR
inhibitor would exhibit a greater measured β-actin C_T value for
a given quantity of DNA. In addition, the inhibitor would be
diluted along with the sample in the dilution analysis (Fig. 2),
altering the expected C_T value change. Each sample
amplification yielded a similar result in the analysis,
demonstrating that this method of sample preparation is highly
reproducible with regard to sample purity.
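
The per-sample statistics referred to above (mean, standard deviation, and coefficient of variance of triplicate C_T values) can be sketched as follows. The function name `triplicate_stats` is hypothetical, the CV is assumed to be expressed as a percentage, and the sample standard deviation (n - 1) is assumed; the text does not state which convention was used. The three input values are one triplicate as it appears in Table 1.

```python
from statistics import mean, stdev


def triplicate_stats(ct_values):
    """Return (mean, sample standard deviation, CV in percent) for replicate C_T values."""
    m = mean(ct_values)
    sd = stdev(ct_values)   # sample standard deviation (n - 1), an assumption
    cv = 100.0 * sd / m     # coefficient of variance as a percentage, an assumption
    return m, sd, cv


# One triplicate from the 100 ng column of Table 1.
m, sd, cv = triplicate_stats([18.24, 18.23, 18.33])
print(f"mean={m:.2f}  SD={sd:.2f}  CV={cv:.2f}%")
```

Small differences from the tabulated CV are to be expected, since the published figures may have been computed from unrounded readings.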

Quantitative Analysis of a Plasmid After

Table 1. Reproducibility of Sample Preparation Method

                       100 ng                                   25 ng
Sample  -------------------------------------   -------------------------------------
no.     C_T (triplicate)     mean   SD    CV     C_T (triplicate)     mean   SD    CV
1       18.24  18.23  18.33  18.27  0.06  0.32   20.48  20.55  20.5   20.51  0.03  0.17
2       18.35  18.44  --     --     0.06  --     20.61  20.59  20.41  --     0.11  0.54
3       18.3   18.3   18.42  18.34  0.07  0.36   20.54  20.6   20.49  20.54  0.06  0.28
4       18.15  18.23  18.32  18.23  0.08  0.46   20.48  20.44  20.38  20.43  0.05  0.26
5       18.4   18.38  18.46  18.42  0.04  0.23   20.68  20.87  20.63  20.73  0.13  0.61
6       18.54  18.67  19     --     0.24  1.26   21.09  21.04  21.04  21.06  0.03  0.15
7       18.28  18.36  18.52  18.39  0.12  0.66   20.67  20.73  20.65  20.68  0.04  0.2
8       18.45  18.7   18.73  18.63  0.16  0.83   20.98  20.84  20.75  20.86  0.12  0.57
9       18.18  18.34  18.26  18.29  0.1   --     20.46  20.54  20.48  20.51  0.07  0.32
10      18.42  18.57  --     18.55  0.12  0.65   20.79  20.78  20.62  20.73  0.1   0.46
Mean                         18.42  0.17  0.90                        20.66  0.19  0.94

SD, standard deviation; CV, coefficient of variance; --, value not legible in the scanned source.




[...] vector containing a partial cDNA for human factor
VIII, pF8TM. A series of transfections was set up using a
decreasing amount of the plasmid (40, 4, 0.5, and 0.1 µg).
Twenty-four hours post-transfection, total DNA was purified from
each flask of cells. β-Actin gene quantity was chosen as a value
for normalization of genomic DNA concentration from each sample.
In this experiment, β-actin gene content should remain constant
relative to total genomic DNA. Figure 3 shows the result of the
β-actin DNA measurement (100 ng total DNA determined by
ultraviolet spectroscopy) of each sample. Each sample was
analyzed in triplicate and the mean β-actin C_T values of the
triplicates were plotted (error bars represent one standard
deviation). The highest difference between any two sample means
was 0.[illegible]5 C_T. Ten nanograms of total DNA of each
sample were also examined for β-actin. The results again showed
that very similar amounts of genomic DNA were present; the
maximum mean β-actin C_T value difference was 1.0. As Figure 3
shows, the rate of β-actin C_T change between the 100- and 10-ng
samples was similar (slope values range between -3.56 and
-3.45). This verifies again that the method of sample
preparation yields samples of identical PCR integrity (i.e., no
sample contained an excessive amount of a PCR inhibitor).
However, these results indicate that each sample contained
slight differences in the actual amount of genomic DNA analyzed.
Determination of actual genomic DNA concentration was
accomplished
Figure 2 Sample preparation purity. The replicate
samples shown in Table 1 were also amplified in
triplicate using 25 ng of each DNA sample. The figure
shows the input DNA concentration (100 and
25 ng) vs. C_T. In the figure, the 100 and 25 ng
points for each sample are connected by a line.



by plotting the mean β-actin C_T value obtained
for each 100-ng sample on a β-actin standard
curve (shown in Fig. 4C). The actual genomic
DNA concentration of each sample, a, was
obtained by extrapolation to the x-axis.

Figure 4A shows the measured (i.e., non-normalized)
quantities of factor VIII plasmid DNA (pF8TM) from each
of the four transient cell transfections. Each reaction
contained 100 ng of total sample DNA (as determined by UV
spectroscopy). Each sample was analyzed in triplicate













Figure 3 Analysis of transfected cell DNA quantity
and purity. The DNA preparations of the four 293
cell transfections (40, 4, 0.5, and 0.1 µg of pF8TM)
were analyzed for the β-actin gene. 100 and 10 ng
(determined by ultraviolet spectroscopy) of each
sample were amplified in triplicate. For each
amount of pF8TM that was transfected, the β-actin
C_T values are plotted versus the total input DNA
[x-axis: log (ng input DNA)]


PCR amplifications. As shown, pF8TM purified
from the 293 cells decreases (mean C_T values
increase) with decreasing amounts of plasmid
transfected. The mean C_T values obtained for
pF8TM in Figure 4A were plotted on a standard
curve comprised of serially diluted pF8TM,
shown in Figure 4B. The quantity of pF8TM, b,
found in each of the four transfections was
determined by extrapolation to the x-axis of the
standard curve in Figure 4B. These uncorrected
values, b, for pF8TM were normalized to determine
the actual amount of pF8TM found per 100
ng of genomic DNA by using the equation:

$$\frac{b \times 100\ \text{ng}}{a} = \text{actual pF8TM copies per 100 ng of genomic DNA}$$

where a = actual genomic DNA in a sample and
b = pF8TM copies from the standard curve. The
normalized quantity of pF8TM per 100 ng of
genomic DNA for each of the four transfections is
shown in Figure 4D. These results show that the
quantity of factor VIII plasmid associated with
the 293 cells, 24 hr after transfection, decreases
with decreasing plasmid concentration used in
the transfection. The quantity of pF8TM
associated with 293 cells, after transfection with 40 µg
of plasmid, was 35 pg per 100 ng genomic DNA.
This results in ~520 plasmid copies per cell.
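
The two-step quantitation described above (read a quantity off a standard curve, then normalize to 100 ng of genomic DNA) can be sketched as below. This is a minimal illustration, not the authors' software: the function names are hypothetical, the standard curve is assumed to be a least-squares line of C_T against log10(quantity), and the numbers in the usage lines are invented for demonstration.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of C_T = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to read a quantity from a measured C_T."""
    return 10 ** ((ct - intercept) / slope)


def normalize_per_100ng(b, a_ng):
    """Apply (b x 100 ng) / a, where a is the actual genomic DNA in ng."""
    return b * 100.0 / a_ng


# Hypothetical 1:2 dilution series with ideal doubling (one cycle per dilution).
slope, intercept = fit_standard_curve([100, 50, 25, 12.5], [18.0, 19.0, 20.0, 21.0])
b = quantity_from_ct(19.5, slope, intercept)   # uncorrected quantity
print(normalize_per_100ng(b, a_ng=80.0))       # corrected to 100 ng genomic DNA
```

For an ideal reaction that doubles the product every cycle, the fitted slope is about -3.32 cycles per decade of input, which is close to the -3.56 to -3.45 range reported for the β-actin measurements.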



DISCUSSION

We have described a new method for quantitating
gene copy numbers using real-time analysis
of PCR amplifications. Real-time PCR is compatible
with either of the two PCR (RT-PCR) approaches:
(1) quantitative competitive, where an
internal competitor for each target sequence is
used for normalization (data not shown), or (2)
quantitative comparative PCR using a normalization
gene contained within the sample (i.e., β-actin)
or a "housekeeping" gene for RT-PCR. If
equal amounts of nucleic acid are analyzed for
each sample and if the amplification efficiency
before quantitative analysis is identical for each
sample, the internal control (normalization gene
or competitor) should give equal signals for all
samples.

The real-time PCR method offers several advantages
over the other two methods currently
employed (see the Introduction). First, the real-time
PCR method is performed in a closed-tube
system and requires no post-PCR manipulation
Figure 4 Quantitative analysis of pF8TM in transfected cells. (A) Amount of
plasmid DNA used for the transfection plotted against the mean C_T value
determined for pF8TM remaining 24 hr after transfection. (B,C) Standard curves of
pF8TM and β-actin, respectively. pF8TM DNA (B) and genomic DNA (C) were
diluted serially 1:5 before amplification with the appropriate primers. The β-actin
standard curve was used to normalize the results of A to 100 ng of genomic DNA.
(D) The amount of pF8TM present per 100 ng of genomic DNA.



of sample. Therefore, the potential for PCR contamination
in the laboratory is reduced because
amplified products can be analyzed and disposed
of without opening the reaction tubes. Second,
this method supports the use of a normalization
gene (i.e., β-actin) for quantitative PCR or
housekeeping genes for quantitative RT-PCR controls.
Analysis is performed in real time during the log
phase of product accumulation. Analysis during
log phase permits many different genes (over a
wide input target range) to be analyzed
simultaneously, without concern of reaching reaction
plateau at different cycles. This will make multigene
analysis assays much easier to develop,
because individual internal competitors will not be
needed for each gene under analysis. Third,
sample throughput will increase dramatically
with the new method because there is no post-PCR
processing time. Additionally, working in a
[...] format is highly compatible with
automation technology.

The real-time PCR method is highly reproducible.
Replicate amplifications can be analyzed
for each sample, minimizing potential error. The
system allows for a very large assay dynamic
range (approaching 1,000,000-fold starting target).
Using a standard curve for the target of interest,
relative copy number values can be determined
for any unknown sample. Fluorescent
threshold values, C_T, correlate linearly with
relative DNA copy numbers. Real time quantitative
RT-PCR methodology (Gibson et al., this issue)
has also been developed. Finally, real time
quantitative PCR methodology can be used to develop
high-throughput screening assays for a variety of
applications [quantitative gene expression (RT-PCR),
gene copy assays (Her2, HIV, etc.), genotyping
(knockout mouse analysis), and immuno-PCR].

Real-time PCR may also be performed using
intercalating dyes (Higuchi et al. 1992) such as
ethidium bromide. The fluorogenic probe
method offers a major advantage over
intercalating dyes: greater specificity (i.e., primer
dimers and nonspecific PCR products are not
detected).
METHODS

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc.,
Friendswood, TX) from cells transfected with a factor VIII
expression vector, pCIS2.8c25D (Eaton et al. 1986; Gorman
et al. 1990). A factor VIII partial cDNA sequence was
generated by RT-PCR [GeneAmp EZ rTth RNA PCR Kit
(part [illegible], PE Applied Biosystems, Foster City, CA)]
using the PCR primers F8for and F8rev (primer sequences
are shown below). The amplicon was reamplified using
modified F8for and F8rev primers (appended with BamHI
and HindIII restriction site sequences at the 5' end) and
cloned into pGEM-3Z (Promega Corp., Madison, WI). The
resulting clone, pF8TM, was used for transient transfection
of 293 cells.



Amplification of Target DNA and Detection of Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with the
primers F8for 5'-[sequence illegible]-3' and F8rev
5'-[sequence illegible]-3'. The reaction produced a 422-bp
PCR product. The forward primer was designed to recognize
a unique sequence found in the 5' untranslated region of
the parent pCIS2.8c25D plasmid and therefore does not
recognize and amplify the human factor VIII gene. Primers
were chosen with the assistance of the computer program
Oligo 4.0 (National Biosciences, Inc., Plymouth, MN). The
human β-actin gene was amplified with the primers β-actin
forward primer 5'-TCACCCACACTGTGCCCATCTACGA-3' and
β-actin reverse primer 5'-CAGCGGAACCGCTCATTGCCAATGG-3'.
The reaction produced a 295-bp PCR product.

Amplification reactions (50 µl) contained a DNA sample,
10× PCR Buffer II (5 µl), 200 µM dATP, dCTP, dGTP, and
400 µM dUTP, 4 mM MgCl2, 1.25 Units AmpliTaq DNA
polymerase, 0.5 unit AmpErase uracil N-glycosylase (UNG),
50 pmole of each factor VIII primer, and 15 pmole of each
β-actin primer. The reactions also contained one of the
following detection probes (100 nM each): F8 probe
5'(FAM)A[sequence illegible]GCCTT(TAMRA)p-3' and β-actin
probe 5'(FAM)ATGCCC-X(TAMRA)-CCCCCATGCCATCp-3', where p
indicates phosphorylation and X indicates a linker arm
nucleotide. Reaction tubes were MicroAmp Optical Tubes
(part number [illegible], Perkin Elmer) that were frosted
(at Perkin Elmer) to prevent light from reflecting. Tube
caps were similar to MicroAmp caps but specially designed
to prevent light scattering. All of the PCR consumables
were supplied by PE Applied Biosystems (Foster City, CA)
except the factor VIII primers, which were synthesized at
Genentech, Inc. (South San Francisco, CA). Probes were
designed using the Oligo 4.0 software, following guidelines
suggested in the Model 7700 Sequence Detector instrument
manual. Briefly, probe T_m should be at least 5°C higher
than the annealing temperature used during thermal
cycling; primers should not form stable duplexes with the
probe.

The thermal cycling conditions included 2 min at
50°C and 10 min at 95°C. Thermal cycling proceeded with
[...] reactions were performed in the Model 7700 Sequence
Detector (PE Applied Biosystems), which contains a GeneAmp
PCR System 9600. Reaction conditions were programmed on a
Power Macintosh 7100 (Apple Computer, Santa Clara, CA)
linked directly to the Model 7700 Sequence Detector.
Analysis of data was also performed on the Macintosh
computer. Collection and analysis software was developed
at PE Applied Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human
fetal kidney suspension cell line, were grown to 80%
confluency and transfected with pF8TM. Cells were grown in
the following media: 50% HAM'S F12 without GHT, 50% low
glucose Dulbecco's modified Eagle medium (DMEM) without
glycine with sodium bicarbonate, 10% fetal bovine serum,
2 mM L-glutamine, and 1% penicillin-streptomycin. The
media was changed 30 min before the transfection. pF8TM
DNA amounts of 40, 4, 0.5, and 0.1 µg were added to 1.5 ml
of a solution containing 0.125 M CaCl2 and 1× HEPES. The
four mixtures were left at room temperature for 10 min and
then added dropwise to the cells. The flasks were incubated
at 37°C and 5% CO2 for 24 hr, washed with PBS, and
resuspended in PBS. The resuspended cells were divided into
aliquots and DNA was extracted immediately using the QIAamp
Blood Kit (Qiagen, Chatsworth, CA). DNA was eluted into
200 µl of [illegible] mM Tris-HCl at pH 8.0.

ACKNOWLEDGMENTS 

We thank Genentech's DNA Synthesis Group for primer
synthesis and Genentech's Graphics Group for assistance
with the figures.

The publication costs of this article were defrayed in
part by payment of page charges. This article must therefore
be hereby marked "advertisement" in accordance
with 18 USC section 1734 solely to indicate this fact.



REFERENCES 

Bassler, H.A., S.J. Flood, K.J. Livak, J. Marmaro, R. Knorr,
and C.A. Batt. 1995. Use of a fluorogenic probe in a
PCR-based assay for the detection of Listeria
monocytogenes. Appl. Environ. Microbiol. 61: 3724-3728.

Becker-Andre, M. 1991. Quantitative evaluation of
mRNA levels. Meth. Mol. Cell. Biol. 2: 189-201.

Clementi, M., S. Menzo, P. Bagnarelli, A. Manzin, A.
Valenza, and P.E. Varaldo. 1993. Quantitative PCR and
RT-PCR in virology. [Review]. PCR Methods Applic.
2: 191-196.

Connor, R.I., H. Mohri, Y. Cao, and D.D. Ho. 1993.
Increased viral burden and cytopathicity correlate
temporally with CD4+ T-lymphocyte decline and
clinical progression in human immunodeficiency virus
type 1-infected individuals. J. Virol. 67: 1772-1777.

Eaton, D.L., W.I. Wood, D. Eaton, P.E. Hass, P.
Hollingshead, K. Wion, J. Mather, R.M. Lawn, G.A.
Vehar, and C. Gorman. 1986. Construction and
characterization of an active factor VIII variant lacking
the central one third of the molecule. Biochemistry
25: 8343-8347.

Fasco, M.J., C.P. Treanor, S. Spivack, H.L. Figge, and L.S.
Kaminsky. 1995. Quantitative RNA-polymerase chain
reaction-DNA analysis by capillary electrophoresis and
laser-induced fluorescence. Anal. Biochem. 224: 140-147.

Ferre, F. 1992. Quantitative or semi-quantitative PCR:
Reality versus myth. PCR Methods Applic. 2: 1-9.

Furtado, M.R., L.A. Kingsley, and S.M. Wolinsky. 1995.
Changes in the viral mRNA expression pattern correlate
with a rapid rate of CD4+ T-cell number decline in
human immunodeficiency virus type 1-infected
individuals. J. Virol. 69: 2092-2100.

Gibson, U.E.M., C.A. Heid, and P.M. Williams. 1996. A
novel method for real time quantitative competitive
RT-PCR. Genome Res. (this issue).

Gorman, C.M., D.R. Gies, and G. McCray. 1990.
Transient production of proteins using an adenovirus
transformed cell line. DNA Prot. Engin. Tech. 2: 3-10.

Higuchi, R., G. Dollinger, P.S. Walsh, and R. Griffith.
1992. Simultaneous amplification and detection of
specific DNA sequences. Biotechnology 10: 413-417.

Holland, P.M., R.D. Abramson, R. Watson, and D.H.
Gelfand. 1991. Detection of specific polymerase chain
reaction product by utilizing the 5' to 3' exonuclease
activity of Thermus aquaticus DNA polymerase. Proc.
Natl. Acad. Sci. 88: 7276-7280.

Huang, S.K., H.Q. Xiao, T.J. Kleine, G. Paciotti, D.G.
Marsh, L.M. Lichtenstein, and M.C. Liu. 1995a. IL-13
expression at the sites of allergen challenge in patients
with asthma. J. Immun. 155: 2688-2694.

Huang, S.K., M. Yi, E. Palmer, and D.G. Marsh. 1995b. A
dominant T cell receptor beta-chain in response to a
short ragweed allergen, Amb a 5. J. Immun. 154:
6157-6162.

Kellogg, D.E., J.J. Sninsky, and S. Kwok. 1990.
Quantitation of HIV-1 proviral DNA relative to cellular
DNA by the polymerase chain reaction. Anal. Biochem.
189: 202-208.

Lee, L.G., C.R. Connell, and W. Bloch. 1993. Allelic
discrimination by nick-translation PCR with fluorogenic
probes. Nucleic Acids Res. 21: 3761-3766.

Livak, K.J., S.J. Flood, J. Marmaro, W. Giusti, and K.
Deetz. 1995a. Oligonucleotides with fluorescent dyes at
opposite ends provide a quenched probe system useful
for detecting PCR product and nucleic acid
hybridization. PCR Methods Applic. 4: 357-362.

Livak, K.J., J. Marmaro, and J.A. Todd. 1995b. Towards
fully automated genome-wide polymorphism screening
[Letter]. Nature Genet. 9: 341-342.

Mulder, J., N. McKinney, C. Christopherson, J. Sninsky,
L. Greenfield, and S. Kwok. 1994. Rapid and simple PCR
assay for quantitation of human immunodeficiency virus
type 1 RNA in plasma: Application to acute retroviral
infection. J. Clin. Microbiol. 32: 292-300.

Pang, S., Y. Koyanagi, S. Miles, C. Wiley, H.V. Vinters,
and I.S. Chen. 1990. High levels of unintegrated HIV-1
DNA in brain tissue of AIDS dementia patients. Nature
343: 85-89.

Piatak, M.J., K.C. Luk, B. Williams, and J.D. Lifson.
1993a. Quantitative competitive polymerase chain
reaction for accurate quantitation of HIV DNA and RNA
species. BioTechniques 14: 70-81.

Piatak, M.J., M.S. Saag, L.C. Yang, S.J. Clark, J.C. Kappes,
K.C. Luk, B.H. Hahn, G.M. Shaw, and J.D. Lifson. 1993b.
High levels of HIV-1 in plasma during all stages of
infection determined by competitive PCR [see
Comments]. Science 259: 1749-1754.

Prud'homme, G.J., D.H. Kono, and A.N. Theofilopoulos.
1995. Quantitative polymerase chain reaction analysis
reveals marked overexpression of interleukin-1 beta,
interleukin-1 and interferon-gamma mRNA in the lymph
nodes of lupus-prone mice. Mol. Immunol. 32: 495-503.

Raeymaekers, L. 1995. A commentary on the practical
applications of competitive PCR. Genome Res. 5: 91-94.

Sharp, P.A., A.J. Berk, and S.M. Berget. 1980.
Transcription maps of adenovirus. Methods Enzymol.
65: 750-768.

Slamon, D.J., G.M. Clark, S.G. Wong, W.J. Levin, A.
Ullrich, and W.L. McGuire. 1987. Human breast cancer:
Correlation of relapse and survival with amplification of
the HER-2/neu oncogene. Science 235: 177-182.

Southern, E.M. 1975. Detection of specific sequences
among DNA fragments separated by gel electrophoresis.
J. Mol. Biol. 98: 503-517.

Tan, X., X. Sun, C.F. Gonzalez, and W. Hsueh. 1994. PAF
and TNF increase the precursor of NF-kappa B p50
mRNA in mouse intestine: Quantitative analysis by
competitive PCR. Biochim. Biophys. Acta 1215: 157-162.

Thomas, P.S. 1980. Hybridization of denatured RNA and
small DNA fragments transferred to nitrocellulose. Proc.
Natl. Acad. Sci. 77: 5201-5205.

Williams, S., C. Schwer, A. Krishnarao, C. Heid, B.
Karger, and P.M. Williams. 1996. Quantitative
competitive PCR: Analysis of amplified products of the
HIV-1 gag gene by capillary electrophoresis with laser
induced fluorescence detection. Anal. Biochem. (in press).



Received June [illegible], accepted in revised form July 29,
1996.




Proc. Natl. Acad. Sci. USA

Vol. 95, pp. 14717-14722, December 1998

Cell Biology, Medical Sciences

WISP genes are members of the connective tissue growth factor
family that are up-regulated in Wnt-1-transformed cells and
aberrantly expressed in human colon tumors

Diane Pennica*†, Todd A. Swanson*, James W. Welsh*, Margaret A. Roy*, David A. Lawrence*,
James Lee*, Jennifer Brush*, Lisa A. Taneyhill§, Bethanne Deuel*, Michael Lew¶, Colin Watanabe‖,
Robert L. Cohen*, Mona F. Melhem**, Gene G. Finley**, Phil Quirke††, Audrey D. Goddard*,
Kenneth J. Hillan¶, Austin L. Gurney*, David Botstein‡‡, and Arnold J. Levine§

Departments of *Molecular Oncology, ‡Molecular Biology, ‖Scientific Computing, and ¶Pathology, Genentech Inc., 1 DNA Way, South San Francisco, CA 94080;
**University of Pittsburgh School of Medicine, Veterans Administration Medical Center, Pittsburgh, PA 15240; ††University of Leeds, Leeds, LS2 9JT United
Kingdom; ‡‡Department of Genetics, Stanford University, Palo Alto, CA 94305; and §Department of Molecular Biology, Princeton University, Princeton, NJ
08544



Contributed by David Botstein and Arnold J. Levine, October 21, 1998 

ABSTRACT Wnt family members are critical to many 
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the 
mouse mammary epithelial cell line C57MG transformed by 
Wnt-1, but not by Wnt-4. Together with a third related gene, 
WISP-3, these proteins define a subfamily of the connective 
tissue growth factor family. Two distinct systems demon- 
strated WISP induction to be associated with the expression of 
Wnt-1. These included (i) C57MG cells infected with a Wnt-1 
retroviral vector or expressing Wnt-1 under the control of a 
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overex-
pressed (4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1 
signaling and that aberrant levels of WISP expression in colon 
cancer may play a role in colon tumorigenesis. 



Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the tran-
scription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations con- 
tribute to the development of these types of cancer, implicating 
the Wnt pathway in tumorigenesis (1). 

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes 
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded

Abbreviations: TGF, transforming growth factor; CTGF, connective 
tissue growth factor; SSH, suppression subtractive hybridization; 
VWC, von Willebrand factor type C module. 
Data deposition: The sequences reported in this paper have been 
deposited in the Genbank database (accession nos. AF100777, 
AF100778, AF100779, AF100780, and AF100781). 
†To whom reprint requests should be addressed. e-mail: diane@gene.
com. 
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The sub-
tracted cDNA library was subcloned into a pGEM-T vector for 
further analysis. 

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones en-
coding full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library 
with a probe corresponding to nucleotides 1463-1512. Full- 
length cDNAs encoding WISP-3 were cloned from human 
bone marrow and fetal kidney libraries. 

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles. 
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. ³³P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative 
gene amplification and RNA expression of WISPs and c-myc in 
the cell lines, colorectal tumors, and normal mucosa were 
determined by quantitative PCR. Gene-specific primers and 
fluorogenic probes (sequences available on request) were 
designed and used to amplify and quantitate the genes. The 
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehydrogenase
housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems. 
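
The relative-quantitation arithmetic described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the Ct values are hypothetical, the function names are invented for the example, and the way the glyceraldehyde-3-phosphate dehydrogenase normalization enters (subtracting the housekeeping Ct from the target Ct within each specimen) is an assumption. The SE is propagated with a first-order δ-method approximation.

```python
import math


def delta_ct(ct_target_ref, ct_gapdh_ref, ct_target_sample, ct_gapdh_sample):
    """Return dCt between a reference and a sample, each normalized to GAPDH.

    Assumption: normalization subtracts the housekeeping Ct from the
    target Ct within each specimen before the difference is taken.
    """
    return (ct_target_ref - ct_gapdh_ref) - (ct_target_sample - ct_gapdh_sample)


def relative_level(d_ct):
    """Relative gene copy number or RNA level, computed as 2**dCt."""
    return 2.0 ** d_ct


def delta_method_se(d_ct, se_d_ct):
    """First-order delta-method SE of 2**dCt, given the SE of dCt."""
    return math.log(2.0) * (2.0 ** d_ct) * se_d_ct


# Hypothetical cycle numbers, for illustration only.
d = delta_ct(ct_target_ref=27.0, ct_gapdh_ref=20.0,
             ct_target_sample=26.0, ct_gapdh_sample=20.0)
fold = relative_level(d)
se = delta_method_se(d, se_d_ct=0.1)
print(f"dCt = {d:.2f}, relative level = {fold:.2f} +/- {se:.2f}")
```

Under these assumptions, a sample that reaches the detection threshold one cycle earlier than the reference corresponds to a relative level of 2.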

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt- 
1-inducible genes, we used the technique of SSH using the 
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially ex- 
pressed cDNAs (1,384 total) were sequenced. Thirty-nine 
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially 
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no 
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the 
sequence compared with mouse WISP-1. The cDNA sequences 
of mouse and human WISP-1 were 1,766 and 2,830 bp in length, 
respectively, and encode proteins of 367 aa, with predicted 
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cys- 
teine residues, and four potential N-linked glycosylation sites 
and are 84% identical (Fig. 2A). 

Full-length cDNA clones of mouse and human WISP-2 were 
1,734 and 1,293 bp in length, respectively, and encode proteins 
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 1B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at



[Fig. 1 image: Northern blots, panels A and B; lanes C57MG parent, Wnt-1, and Wnt-4.]







Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4, 
expression in C57MG cells. Northern analysis of WISP-1 (A) and 
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/
Wnt-4 cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.
Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential 
signal sequence, insulin-like growth factor-binding protein (IGF-BP), 
VWC, thrombospondin (TSP), and C-terminal (CT) domains are 
underlined. 

position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the 
WISP-1 protein sequence and identified several ESTs as 
potentially related sequences. We identified a homologous 
protein that we have called WISP-3. A full-length human 
WISP-3 cDNA of 1,371 bp was isolated corresponding to those 
ESTs that encode a 354-aa protein with a predicted molecular 
mass of 39,293. WISP-3 has two potential N-linked glycosyl- 
ation sites and 36 cysteine residues. An alignment of the three 
human WISP proteins shows that WISP-1 and WISP-3 are the 
most similar (42% identity), whereas WISP-2 has 37% identity 
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).
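
Percent identity figures of this kind are computed from a pairwise alignment. The sketch below shows one common convention (identical residues divided by the number of aligned, non-gap columns). The convention actually used for the WISP comparisons is not stated here, so this choice is an assumption, and the two short sequences are made-up toy strings, not WISP sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must already be aligned and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned strings, for illustration only.
identity = percent_identity("ACDE-G", "ACDFHG")
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, and they can give noticeably different percentages for the same alignment.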

WISPs Are Homologous to the CTGF Family of Proteins. 
Human WISP-1, WISP-2, and WISP-3 are novel sequences; 
however, mouse WISP-1 is the same as the recently identified 
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and 
metastatic potential of K-1735 mouse melanoma cells (15). 
Human and mouse WISP-2 are homologous to the recently 
described rat gene, rCop-1 (16). Significant homology (36- 
44%) was seen to the CCN family of growth factors. This family 
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18, 
19). nov (nephroblastoma overexpressed) is an immediate 
early gene associated with quiescence and found altered in 
Wilms tumors (20). The proteins of the CCN family share 
functional, but not sequence, similarity to Wnt-1. All are 
secreted, cysteine-rich heparin binding glycoproteins that as- 
sociate with the cell surface and extracellular matrix. 

WISP proteins exhibit the modular architecture of the CCN 
family, characterized by four conserved cysteine-rich domains 
(Fig. 3B) (21). The N-terminal domain, which includes the first 
12 cysteine residues, contains a consensus sequence (GCGC- 
CXXC) conserved in most insulin-like growth factor (IGF)- 
Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain 
that are absent in WISP-3 are indicated with a dot. (C) Expression of 
WISP mRNA in human tissues. PCR was performed on human 
multiple-tissue cDNA panels (CLONTECH) from the indicated adult 
and fetal tissues. 

binding proteins (BP). This sequence is conserved in WISP-2 
and WISP-3, whereas WISP-1 has a glutamine in the third 
position instead of a glycine. CTGF recently has been shown 
to specifically bind IGF (22) and a truncated nov protein 
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and 
is thought to participate in protein complex formation and 
oligomerization (24). The VWC domain of WISP-3 differs 
from all CCN family members described previously, in that it 
contains only six of the 10 cysteine residues (Fig. 3 A and B). 
A short variable region follows the VWC domain. The third 
module, the thrombospondin (TSP) domain is involved in 
binding to sulfated glycoconjugates and contains six cysteine 
residues and a conserved WSxCSxxCG motif first identified in 
thrombospondin (25). The C-terminal (CT) module contain- 
ing the remaining 10 cysteines is thought to be involved in 
dimerization and receptor binding (26). The CT domain is 
present in all CCN family members described to date but is 
absent in WISP-2 (Fig. 3 A and B). The existence of a putative 
signal sequence and the absence of a transmembrane domain 
suggest that WISPs are secreted proteins, an observation 
supported by an analysis of their expression and secretion from 
mammalian cell and baculovirus cultures (data not shown). 
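
The two sequence motifs named above lend themselves to a simple pattern search. The following sketch translates GCGCCXXC and WSxCSxxCG into regular expressions, taking X or x to mean any residue (an assumption about the notation), and allows glutamine at the third position of the first motif, as described for WISP-1. The input is a made-up toy peptide, not a WISP sequence, and the helper names are hypothetical.

```python
import re

# IGF-BP-like consensus GCGCCXXC; [GQ] also admits the WISP-1 variant.
IGFBP_MOTIF = re.compile(r"GC[GQ]CC..C")
# Thrombospondin-type motif WSxCSxxCG.
TSP_MOTIF = re.compile(r"WS.CS..CG")


def find_motif(pattern, sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# Toy peptide, for illustration only.
toy = "MAAGCGCCKVCAAWSPCSTTCG"
igfbp_hits = find_motif(IGFBP_MOTIF, toy)
tsp_hits = find_motif(TSP_MOTIF, toy)
n_cys = toy.count("C")

print("IGF-BP motif hits:", igfbp_hits)
print("TSP motif hits:", tsp_hits)
print("cysteine count:", n_cys)
```

A pattern search of this kind only flags candidate motifs. Assigning a domain also depends on the surrounding cysteine spacing, which this sketch does not check.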

Expression of WISP mRNA in Human Tissues. Tissue- 
specific expression of human WISPs was characterized by PCR 
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung, 
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C). 
Little or no expression was detected in the brain, liver, skeletal 
muscle, colon, peripheral blood leukocytes, prostate, testis, or 
thymus. WISP-2 had a more restricted tissue expression and 
was detected in adult skeletal muscle, colon, ovary, and fetal 
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and 
small intestine. 

In Situ Localization of WISP-1 and WISP-2. Expression of 
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expres- 
sion of WISP-1 was observed in stromal fibroblasts lying within 
the fibrovascular tumor stroma (Fig. 4 A-D). However, low- 
level WISP-1 expression also was observed focally within tumor 
cells (data not shown). No expression was observed in normal 
breast. Like WISP-1, WISP-2 expression also was seen in the 
tumor stroma in breast tumors from Wnt-1 transgenic animals 
(Fig. 4 E-H). However, WISP-2 expression in the stroma was 
in spindle-shaped cells adjacent to capillary vessels, whereas 




Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells 
appeared to be adjacent to capillary vessels whereas tumor cells are 
negative (G and H). 



the predominant cell type expressing WISP-1 was the stromal 
fibroblasts. 

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on 
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus. 
Genomic DNA from human colon cancer cell lines was 
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1 
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and 
WISP-2 was significantly greater than one, approximately 
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold 
for WISP-2 in 92% of the tumors (P < 0.001 for each). The 
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was signifi- 
cantly higher than that of WISP-1 (P < 0.001). 

The levels of WISP transcripts in RNA isolated from 19 
adenocarcinomas and their matched normal mucosa were 




Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell 
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human 
c-myc probe (located at bp 1901-2000). The WISP and myc genes are 
detected in normal human genomic DNA after a longer film exposure. 

Fig. 6. Genomic amplification of WISP genes in human colon 
tumors. The relative gene copy number of the WISP genes in 25 
adenocarcinomas was assayed by quantitative PCR, by comparing 
DNA from primary human tumors with pooled DNA from 10 healthy 
donors. The data are means ± SEM from one experiment done in 
triplicate. The experiment was repeated at least three times. 

assessed by quantitative PCR (Fig. 7). The level of WISP-1 
RNA present in tumor tissue varied but was significantly 
increased (2- to >25-fold) in 84% (16/19) of the human colon 
tumors examined compared with normal adjacent mucosa. 
Four of 19 tumors showed greater than 10-fold overexpression. 
In contrast, in 79% (15/19) of the tumors examined, WISP-2 
RNA expression was significantly lower in the tumor than the 
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal 
Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice. 
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold. 



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain, 
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3. 
This domain is thought to be involved in receptor binding and 
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine 
knot motif exist as dimers (32). It is tempting to speculate that 
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2 
exists as a monomer. If the CT domain is also important for 
receptor binding, WISP-2 may bind its receptor through a 
different region of the molecule than the other CCN family 
members. No specific receptors have been identified for CTGF 
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed
that mammary tumor cells or inflammatory cells at the tumor
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was 
observed in the stromal cells that surrounded the tumor cells 
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling 
could occur in which the stromal cells could supply WISP-1 and 
WISP-2 to regulate tumor cell growth on the WISP extracel- 
lular matrix. Stromal cell-derived factors in the extracellular 
matrix have been postulated to play a role in tumor cell 
migration and proliferation (35). The localization of WISP-1 
and WISP-2 in the stromal cells of breast tumors supports this 
paracrine model. 

An analysis of WISP-1 gene amplification and expression in 
human colon tumors showed a correlation between DNA 
amplification and overexpression, whereas overexpression of 
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors, 
but its mRNA expression was significantly reduced in the 
majority of tumors compared with the expression in normal 
colonic mucosa from the same patient. The gene for human 
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in 
node negative breast cancer and many colon cancers, suggest- 
ing the existence of one or more oncogenes at this locus 
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification 
observed for WISP-2 may be caused by another gene in this 
amplicon. 

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to human
carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1 
transforms cells and induces tumorigenesis is unknown, the 
identification of WISPs as genes that may be regulated down- 
stream of Wnt-1 in C57MG cells suggests they could be 
important mediators of Wnt-1 transformation. The amplifica- 
tion and altered expression patterns of the WISPs in human 
colon tumors may indicate an important role for these genes 
in tumor development.
methods. Peptides AENK or AEQK were dissolved in water, made isotonic with
NaCl and diluted into RPMI growth medium. T-cell-proliferation assays were
done essentially as described 2 "'. Briefly, after antigen pulsing UOu-gm]" 1 
TTCF) with tetrapeptides (1-2 mg ml⁻¹), PBMCs or EBV-B cells were
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added
to a final concentration of 0.1 M and the cells were washed five times in RPMI
1640 medium containing 1% FCS before co-culture with T-cell clones in
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed
with 1 μCi of ³H-thymidine and harvested for scintillation counting 16 h later.
Predigestion of native TTCF was done by incubating 200 μg TTCF with 0.25 μg
pig kidney legumain in 500 μl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C.
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine)
EEDI and HIDNESDI, which are based on the TTCF sequence, and
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin,
were obtained by custom synthesis. The three C-terminal lysine residues were
added to the natural sequence to aid solubility. The transferrin glycopeptide
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion
of 5 mg reduced, carboxy-methylated human transferrin followed by
concanavalin A chromatography¹¹. Glycopeptides corresponding to residues
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by
mass spectrometry and N-terminal sequencing. The lyophilized transferrin-derived
peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with
5-50 mU ml⁻¹ pig kidney legumain or B-cell AEP. Products were analysed by
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml⁻¹ α-cyanocinnamic
acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems
Elite STR mass spectrometer set to linear or reflector mode. Internal standardization
was obtained with a matrix ion of 568.13 mass units.
Fas ligand (FasL) is produced by activated T cells and natural
killer cells and it induces apoptosis (programmed cell death) in
target cells through the death receptor Fas/Apo1/CD95 (ref. 1).
One important role of FasL and Fas is to mediate immune-cytotoxic
killing of cells that are potentially harmful to the
organism, such as virus-infected or tumour cells¹. Here we
report the discovery of a soluble decoy receptor, termed decoy 
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced 
apoptosis. The DcR3 gene was amplified in about half of 35 
primary lung and colon tumours studied, and DcR3 messenger 
RNA was expressed in malignant tissue. Thus, certain tumours 
may escape FasL-dependent immune-cytotoxic attack by expres- 
sing a decoy receptor that blocks FasL. 

By searching expressed sequence tag (EST) databases, we identi- 
fied a set of related ESTs that showed homology to the tumour 
necrosis factor (TNF) receptor (TNFR) gene superfamily². Using
the overlapping sequence, we isolated a previously unknown full- 
length complementary DNA from human fetal lung. We named the 
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA 
encodes a 300-amino-acid polypeptide that resembles members of 
the TNFR family (Fig. 1a): the amino terminus contains a leader
sequence, which is followed by four tandem cysteine-rich domains
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG)³,
DcR3 lacks an apparent transmembrane sequence, which indicates
that it may be a secreted, rather than a membrane-associated,
molecule. We expressed a recombinant, histidine-tagged form of
DcR3 in mammalian cells; DcR3 was secreted into the cell culture 
medium, and migrated on polyacrylamide gels as a protein of 
relative molecular mass 35,000 (data not shown). DcR3 shares 
sequence identity in particular with OPG (31%) and TNFR2 
(29%), and has relatively less homology with Fas (17%). All of 
the cysteines in the four CRDs of DcR3 and OPG are conserved; 
however, the carboxy-terminal portion of DcR3 is 101 residues
shorter. 

We analysed expression of DcR3 mRNA in human tissues by 
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase
transcript in fetal lung, brain, and liver, and in adult spleen, colon 
and lung. In addition, we observed relatively high DcR3 mRNA 
expression in the human colon carcinoma cell line SW480. 

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL⁴ (Fig. 2a), but not to cells transfected with
TNF⁵, Apo2L/TRAIL⁶,⁷, Apo3L/TWEAK⁸,⁹, or OPGL/TRANCE/
RANKL¹⁰,¹¹ (data not shown). DcR3-Fc immunoprecipitated shed
FasL from FasL-transfected 293 cells (Fig. 2b) and purified soluble
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not
TNFR1. Gel-filtration chromatography showed that DcR3-Fc and
soluble FasL formed a stable complex (Fig. 2d). Equilibrium
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble
FasL with a comparable affinity (Kd = 0.8 ± 0.2 and
1.1 ± 0.1 nM, respectively; Fig. 2e), and that DcR3-Fc could
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e, 
inset). Thus, DcR3 competes with Fas for binding to FasL. 
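
To make the equilibrium figures concrete, the sketch below evaluates a textbook one-site binding model and a simple competitive-binding expression using the reported Kd values. It assumes 1:1 binding at equilibrium with free concentrations equal to the concentrations supplied, which ignores the multivalency of FasL and of the Fc fusion proteins. The concentrations are hypothetical and the functions are illustrative only, not a reconstruction of the analysis behind Fig. 2e.

```python
KD_DCR3_NM = 0.8  # reported Kd of DcR3-Fc for soluble FasL, in nM
KD_FAS_NM = 1.1   # reported Kd of Fas-Fc for soluble FasL, in nM


def fraction_bound(conc_nm, kd_nm):
    """Fraction of ligand occupied by one binder in a one-site model."""
    return conc_nm / (kd_nm + conc_nm)


def fraction_bound_with_competitor(conc_nm, kd_nm, comp_nm, comp_kd_nm):
    """Fraction of ligand occupied by the binder when a competitor is present.

    Assumes both proteins compete for the same site with 1:1 stoichiometry.
    """
    binder_term = conc_nm / kd_nm
    competitor_term = comp_nm / comp_kd_nm
    return binder_term / (1.0 + binder_term + competitor_term)


# Hypothetical concentrations (nM), for illustration only.
half = fraction_bound(0.8, KD_DCR3_NM)
fas_alone = fraction_bound_with_competitor(1.1, KD_FAS_NM, 0.0, KD_DCR3_NM)
fas_with_dcr3 = fraction_bound_with_competitor(1.1, KD_FAS_NM, 0.8, KD_DCR3_NM)

print(f"DcR3-Fc at its Kd occupies {half:.2f} of FasL")
print(f"Fas-Fc occupancy without competitor: {fas_alone:.2f}")
print(f"Fas-Fc occupancy with DcR3-Fc at its Kd: {fas_with_dcr3:.2f}")
```

In this simplified picture, two binders with comparable Kd values displace each other roughly in proportion to their concentrations, which is the qualitative behaviour expected of a decoy receptor.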

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3-Fc
and Fas-Fc blocked soluble-FasL-induced apoptosis in a
similar dose-dependent manner, with half-maximal inhibition at
~0.1 μg ml⁻¹. Time-course analysis showed that the inhibition did
not merely delay cell death, but rather persisted for at least 24 hours
(Fig. 3b). We also tested the effect of DcR3-Fc on activation-induced
cell death (AICD) of mature T lymphocytes, a FasL-dependent
process¹. Consistent with previous results¹³, activation
of interleukin-2-stimulated CD4-positive T cells with anti-CD3
antibody increased the level of apoptosis twofold, and Fas-Fc 
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the 



induction of apoptosis to a similar extent. Thus, DcR3 binding 
blocks apoptosis induction by FasL. 

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and 
granzymes M4 "'\ Peripheral blood natural killer cells triggered 
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc 
and Fas-Fc each reduced killing of target cells from ~65% to
~30%, with half-maximal inhibition at ~1 μg ml⁻¹; the residual
killing was probably mediated by the perforin/granzyme pathway. 
Thus, DcR3 binding blocks FasL-dependent natural killer cell 
activity. Higher DcR3-Fc and Fas-Fc concentrations were required 
to block natural killer cell activity compared with those required to 
block soluble FasL activity, which is consistent with the greater 
potency of membrane-associated FasL compared with soluble 
FasL 17 . 

Given the role of immune-cytotoxic cells in elimination of
tumour cells and the fact that DcR3 can act as an inhibitor of 
FasL, we proposed that DcR3 expression might contribute to the 
ability of some tumours to escape immune-cytotoxic attack. As 
genomic amplification frequently contributes to tumorigenesis, we 
investigated whether the DcR3 gene is amplified in cancer. We 
analysed DcR3 gene-copy number by quantitative polymerase chain 
Figure 1 Primary structure and expression of human DcR3. a, Alignment of the
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101
residues of OPG are not shown. The putative signal cleavage site (arrow), the
cysteine-rich domains (CRD1-4), and the N-linked glycosylation site (asterisk) are
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood 
lymphocyte. 



Figure 2 Interaction of DcR3 with FasL. a, 293 cells were transfected with pRK5
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with
DcR3-Fc (solid line, shaded area), TNFR1-Fc (dotted line) or buffer control
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by
FACS. Statistical analysis showed a significant difference (P < 0.001) between the
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE,
phycoerythrin-labelled cells. b, 293 cells were transfected as in a and metabolically labelled, and
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas.
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3-Fc
or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column
fractions were analysed in an assay that detects complexes containing DcR3-Fc
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag.
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag.
reaction (PCR)' 9 in genomic DNA from 35 primary lung and colon
tumours, relative to pooled genomic DNA from peripheral blood 
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours 
and 9 of 17 colon tumours showed DcR3 gene amplification, 
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we 
analysed the colon tumour DNAs with three more, independent sets 
of DcR3-based PCR primers and probes; we observed nearly the
same amplification (data not shown). 

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe- 
lium, but was essentially absent from adjacent stroma, indicating 
tumour-specific expression. Although the individual tumour speci- 
mens that we analysed for mRNA expression and gene amplification 
were different, the in situ hybridization results are consistent with 
the finding that the DcR3 gene is amplified frequently in tumours. 
SW480 colon carcinoma cells, which showed abundant DcR3 
mRNA expression (Fig. 1b), also had marked DcR3 gene amplification,
as shown by quantitative PCR (fourfold) and by Southern blot
hybridization (fivefold) (data not shown). 

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this,



we mapped the human DcR3 gene by radiation-hybrid analysis; 
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to
chromosome position 20q13. Next, we isolated from a bacterial
artificial chromosome (BAC) library a human genomic clone that 
carries DcR3, and sequenced the ends of the clone's insert. We then 
determined, from the nine colon tumours that showed twofold or 
greater amplification of DcR3, the copy number of the DcR3- 
flanking sequences (reverse and forward) from the BAC, and of 
seven genomic markers that span chromosome 20 (Fig. 4d). The 
DcR3-linked reverse marker showed an average amplification of
roughly threefold, slightly less than the approximately fourfold
amplification of DcR3; the other markers showed little or no
amplification. These data indicate that DcR3 may be at the 'epicentre'
of a distal chromosome 20 region that is amplified in colon
cancer, consistent with the possibility that DcR3 amplification 
promotes tumour survival. 

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other
TNF-ligand-family members; however, this does not rule out the
possibility that DcR3 interacts with other ligands, as do some other
TNFR family members, including OPG 2,19 .

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan- 
ism involves the molecule cFLIP, which modulates apoptosis signal- 
ling downstream of Fas 20 . A second mechanism involves proteolytic 
shedding of FasL from the cell surface 17 . DcR3 competes with Fas for 
Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells
were incubated with Flag-tagged soluble FasL (sFasL; 5 ng ml⁻¹) oligomerized
with anti-Flag antibody (0.1 µg ml⁻¹) in the presence of the proposed inhibitors
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody
as in a, in presence of 1 µg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or
human IgG1 (triangles), and apoptosis was determined at the indicated time
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2,
followed by control (white bars) or anti-CD3 antibody (filled bars), together with
phosphate-buffered saline (PBS), human IgG1, Fas-Fc, or DcR3-Fc (10 µg ml⁻¹).
After 16 h, apoptosis of CD4+ cells was determined (mean ± s.e.m. of results from
five donors). d, Peripheral blood natural killer cells were incubated with
51Cr-labelled Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open
circles) or human IgG1 (triangles), and target-cell death was determined by
release of 51Cr (mean ± s.d. for two donors, each in triplicate).
Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e,
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A
representative bright-field image (left) and the corresponding dark-field image
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads).
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue
(N) are also shown. d, Average amplification of DcR3 compared with amplification
of neighbouring genomic regions (reverse and forward, Rev and Fwd), the
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon
tumours showing DcR3 amplification of twofold or more (b). Data are from two
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test
comparing each marker with DcR3.
FasL binding; hence, it may represent a third mechanism of
extracellular regulation of FasL activity. A decoy receptor that
modulates the function of the cytokine interleukin-1 has been
described 21 . In addition, two decoy receptors that belong to the
TNFR family, DcR1 and DcR2, regulate the FasL-related
apoptosis-inducing molecule Apo2L 22 . Unlike DcR1 and DcR2, which are
membrane-associated proteins, DcR3 is directly secreted into the
extracellular space. One other secreted TNFR-family member is
OPG, which shares greater sequence homology with DcR3 (31%)
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third
decoy for Apo2L 19 . Thus, DcR3 and OPG define a new subset of
TNFR-family members that function as secreted decoys to modulate
ligands that induce apoptosis. Pox viruses produce soluble
TNFR homologues that neutralize specific TNF-family ligands,
thereby modulating the antiviral immune response 2 . Our results
indicate that a similar mechanism, namely, production of a soluble
decoy receptor for FasL, may contribute to immune evasion by
certain tumours.



Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861, 
1789372 and 2207027) showed similarity to members of the TNFR family. We 
screened human cDNA libraries by PCR with primers based on the region of 
EST consensus; fetal lung was positive for a product of the expected size. By 
hybridization to a PCR-generated probe based on the ESTs, one positive clone 
(DNA30942) was identified. When searching for potential alternatively spliced 
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of 
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the 
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human 
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as
described". 

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293 
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5
encoding full-length human FasL 4 (2 µg), together with pRK5 encoding CrmA
(2 µg) to prevent cell death. After 16 h, the cells were incubated with
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by 
Kolmogorov-Smirnov statistical analysis. There was some detectable staining 
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data 
not shown), it is possible that DcR3 recognized some other factor that is 
expressed constitutively on 293 cells. 

Immunoprecipitation. Human 293 cells were transfected as above, and 
metabolically labelled with [35S]cysteine and [35S]methionine (0.5 mCi;
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 µM),
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc
(5 µg), followed by protein A-Sepharose (Repligen). The precipitates were
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000).
Alternatively, purified, Flag-tagged soluble FasL (1 µg) (Alexis) was incubated
with each Fc-fusion protein (1 µg), precipitated with protein A-Sepharose,
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti- 
FasL antibody (Oncogene Research). 

Analysis of complex formation. Flag-tagged soluble FasL (25 ng) was 
incubated with buffer or with DcR3-Fc (40 µg) for 1.5 h at 24 °C. The reaction
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed 
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL 
complex in each fraction was analysed by placing 100 µl aliquots into microtitre
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc,
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and 
streptavidin-horseradish peroxidase (Amersham). Calibration of the column 
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human 



IgG, blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by 
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with anti- 
Flag antibody as above. In the competition assay, Fas-Fc was immobilized as 
above, and the wells were blocked with excess IgG1 before addition of
Flag-tagged soluble FasL plus DcR3-Fc.

T-cell AICD. CD3+ lymphocytes were isolated from peripheral blood of
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech),
stimulated with phytohaemagglutinin (PHA; 2 µg ml⁻¹) for 24 h, and cultured
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis
16 h later by FACS analysis of annexin-V-binding of CD4+ cells 24 .
Natural killer cell activity. Natural killer cells were isolated from peripheral 
blood of individual donors using anti-CD56 magnetic beads (Miltenyi 
Biotech), and incubated for 16 h with 51Cr-loaded Jurkat cells at an
effector-to-target ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1.
Target-cell death was determined by release of 51Cr in effector-target
co-cultures relative to release of 51Cr by detergent lysis of equal numbers of Jurkat
cells. 
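
The release calculation described above can be sketched as a simple ratio. This is a minimal illustration only: the function name and the counts are hypothetical, and the sketch follows the text literally (co-culture release divided by detergent-lysis release) without any background correction, since none is described here.

```python
def percent_chromium_release(coculture_cpm: float, detergent_lysis_cpm: float) -> float:
    """Target-cell death as 51Cr released in co-culture, relative to the
    51Cr released by detergent lysis of an equal number of target cells."""
    if detergent_lysis_cpm <= 0:
        raise ValueError("detergent-lysis counts must be positive")
    return 100.0 * coculture_cpm / detergent_lysis_cpm


# Hypothetical counts per minute, chosen only for illustration.
killing = percent_chromium_release(coculture_cpm=1300.0, detergent_lysis_cpm=2000.0)
print(killing)
```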

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR" 
using a TaqMan instrument (ABI). The method was validated by comparison of
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data
not shown). Gene-specific primers and fluorogenic probes were designed on
the basis of the sequence of DcR3 or of nearby regions identified on a BAC
carrying the human DcR3 gene; alternatively, primers and probes were based
on Stanford Human Genome Center marker AFM218xe7 (T160), which is
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest
available marker which maps to ~500 kilobases from T160, and five extra
markers that span chromosome 20. The DcR3-specific primer sequences were
5'-CTTCTTCGCGCACGCTG-3' and 5'-ATCACGCCGGCACCAG-3' and the
fluorogenic probe sequence was 5'-(FAM)-ACACGATGCGTGCTCCAAGCAGAAp-(TAMRA),
where FAM is 5'-fluorescein phosphoramidite. Relative
gene-copy numbers were derived using the formula 2^(ΔCT), where ΔCT is the
difference in amplification cycles required to detect DcR3 in peripheral blood
lymphocyte DNA compared to test DNA.
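
As a worked illustration of the 2^(ΔCT) formula, the sketch below computes a relative gene-copy number from two threshold-cycle values. The function name and the cycle values are hypothetical. The formula implicitly assumes that product doubles every cycle and that equal amounts of reference and test DNA were loaded.

```python
def relative_copy_number(ct_reference: float, ct_test: float) -> float:
    """Relative gene-copy number 2 ** dCT, where dCT is the number of cycles
    needed for the reference (PBL) DNA minus the number needed for the test DNA."""
    delta_ct = ct_reference - ct_test
    return 2.0 ** delta_ct


# Hypothetical cycle numbers: the tumour sample is detected two cycles earlier.
ct_pbl = 27.0
ct_tumour = 25.0
print(relative_copy_number(ct_pbl, ct_tumour))  # 4.0, i.e. fourfold amplification
```

A test sample detected at the same cycle as the reference gives a value of 1 (no amplification); each cycle of earlier detection doubles the estimate.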
ABC transporters (also known as traffic ATPases) form a large
family of proteins responsible for the translocation of a variety 
of compounds across membranes of both prokaryotes and 
eukaryotes 1 . The recently completed Escherichia coli genome 
sequence revealed that the largest family of paralogous E. coli 
proteins is composed of ABC transporters 2 . Many eukaryotic 
proteins of medical significance belong to this family, such as 
the cystic fibrosis transmembrane conductance regulator (CFTR), 
the P-glycoprotein (or multidrug-resistance protein) and the
heterodimeric transporter associated with antigen processing
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å
resolution of HisP, the ATP-binding subunit of the histidine permease,
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleo- 
tide-binding domains (NBDs), which are highly conserved 
throughout the family, and two transmembrane domains 1 . In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the
domains are generally fused into a single polypeptide chain. The 
periplasmic histidine permease of S. typhimurium and E. coli 1 J " 8 is a 
well-characterized ABC transporter that is a good model for this 
superfamily. It consists of a membrane-bound complex, HisQMP 2 , 
which comprises integral membrane subunits, HisQ and HisM, and 
two copies of HisP, the ATP-binding subunit. HisP, which has 
properties intermediate between those of integral and peripheral 
membrane proteins 9 , is accessible from both sides of the membrane, 
presumably by its interaction with HisQ and HisM 6 . The two HisP 
subunits form a dimer, as shown by their cooperativity in ATP 
hydrolysis 3 , the requirement for both subunits to be present for 
activity 8 , and the formation of a HisP dimer upon chemical cross- 
linking. Soluble HisP also forms a dimer 3 . HisP has been purified 
and characterized in an active soluble form 3 which can be recon- 
stituted into a fully active membrane-bound complex 8 . 

The overall shape of the crystal structure of the HisP monomer is
that of an 'L' with two thick arms (arm I and arm II); the ATP-
binding pocket is near the end of arm I (Fig. 1). A six-stranded
β-sheet (β3 and β8-β12) spans both arms of the L, with a domain of
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side
(within arm I) and a domain of mostly α-helices (α3-α9) on the
Figure 1 Crystal structure of HisP. a, View of the dimer along an axis
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested
to face towards the periplasmic and cytoplasmic sides, respectively (see text).
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices
are shown in orange and β-sheets in green. b, View along the two-fold axis of the
HisP dimer, showing the relative displacement of the monomers not apparent in
a. The β-strands at the dimer interface are labelled. c, View of one monomer from
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick'
representations, respectively. Key residues discussed in the text are indicated in
c. These figures were prepared with MOLSCRIPT 29 . N, amino terminus; C, C
terminus.



NATURE | VOL 396 | 17 DECEMBER 1998 | www.nature.com



Nature © Macmillan Publishers Ltd 1998 






Int. J. Cancer: 78, 661-666 (1998) 
© 1998 Wiley-Liss, Inc. 



Publication of the International Union Against Cancer 
Publication de l'Union Internationale Contre le Cancer
Gene amplification is a common event in the progression of
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method, 
based on fluorescent TaqMan methodology and a new instru- 
ment (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are character- 
ized by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential 
phase, meaning that values are highly reproducible in reac- 
tions starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover, 
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over con- 
tamination; it possesses a wide dynamic range of quantifica- 
tion and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate
a simple and rapid assay for the detection and quantification
of the 3 most frequently amplified genes (myc, ccnd1 and
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2
were observed in 10, 23 and 15%, respectively, of 108
breast-tumor DNAs; the largest observed numbers of gene copies
were 4.6, 18.6 and 15.1, respectively. These results correlated 
well with those of Southern blotting. The use of this new 
semi-automated technique will make molecular analysis of 
human cancers simpler and more reliable, and should find 
broad applications in clinical and research settings. Int. J. 
Cancer 78:661-666, 1998.
© 1998 Wiley-Liss, Inc. 

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplification of several chromosome 
regions, visualized either as extrachromosomal double minutes 
(dmins) or as integrated homogeneously staining regions (HSRs), 
are among the main visible cytogenetic abnormalities in breast 
tumors. Other techniques such as comparative genomic hybridization
(CGH) (Kallioniemi et al., 1994) have also been used in broad
searches for regions of increased DNA copy numbers in tumor 
cells, and have revealed some 20 amplified chromosome regions in 
breast tumors. Positional cloning efforts are underway to identify 
the critical gene(s) in each amplified region. To date, genes known 
to be amplified frequently in breast cancers include myc (8q24), 
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and
Lidereau, 1995). 

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes
should have clinical relevance in breast cancer, since independent
studies have shown that these alterations can be used to identify
sub-populations with a worse prognosis (Berns et al., 1992;
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994)
suggested that these gene alterations may also be useful for the 
prediction and assessment of the efficacy of adjuvant chemotherapy 
and hormone therapy. 

However, published results diverge both in terms of the fre- 
quency of these alterations and their clinical value. For instance, 
over 500 studies in 10 years have failed to resolve the controversy 



surrounding the link suggested by Slamon et al. (1987) between
erbB2 amplification and disease progression. These discrepancies 
are partly due to the clinical, histological and ethnic heterogeneity 
of breast cancer, but technical considerations are also probably 
involved. 

Specific genes (DNA) were initially quantified in tumor cells by 
means of blotting procedures such as Southern and slot blotting. 
These batch techniques require large amounts of DNA (5-10 
µg/reaction) to yield reliable quantitative results. Furthermore,
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of 
product either after a given number of cycles (end-point quantita- 
tive PCR) or after a varying number of cycles during the 
exponential phase (kinetic quantitative PCR). In the first case, an 
internal standard distinct from the target molecule is required to 
ascertain PCR efficiency. The method is relatively easy but implies 
generating, quantifying and storing an internal standard for each 
gene studied. Nevertheless, it is the most frequently applied 
method to date. 

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplifica- 
tion product. Interest in the kinetic method has been stimulated by a 
novel approach using fluorescent TaqMan methodology and a new 
instrument (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real time (Gibson et al., 1996; Heid et
al., 1996). The TaqMan reaction is based on the 5' nuclease assay
first described by Holland et al. (1991). The latter uses the 5'
nuclease activity of Taq polymerase to cleave a specific fluorogenic
oligonucleotide probe during the extension phase of PCR. The
approach uses dual-labeled fluorogenic hybridization probes (Lee
et al., 1993). One fluorescent dye, covalently linked to the 5' end
of the oligonucleotide, serves as a reporter [FAM (i.e.,
6-carboxy-fluorescein)] and its emission spectrum is quenched by a second
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine)
attached to the 3' end. During the extension phase of the PCR
cycle, the fluorescent hybridization probe is hydrolyzed by the
5'-3' nucleolytic activity of DNA polymerase. Nuclease degrada- 
tion of the probe releases the quenching of FAM fluorescence 
emission, resulting in an increase in peak fluorescence emission. 
The fluorescence signal is normalized by dividing the emission 
intensity of the reporter dye (FAM) by the emission intensity of a 
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in 
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 
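
The normalization described above amounts to one division per reaction tube. The sketch below is illustrative only: the function name and the emission intensities are hypothetical, and the units are arbitrary.

```python
def normalized_reporter(fam_emission: float, rox_emission: float) -> float:
    """Rn: reporter-dye (FAM) emission divided by reference-dye (ROX) emission
    measured in the same reaction tube."""
    if rox_emission <= 0:
        raise ValueError("reference-dye emission must be positive")
    return fam_emission / rox_emission


# Hypothetical emission intensities in arbitrary units.
rn = normalized_reporter(fam_emission=1200.0, rox_emission=800.0)
print(rn)
```

Dividing by the reference dye is a way to cancel tube-to-tube differences that affect both dyes equally, such as small differences in reaction volume.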

The real-time PCR method offers several advantages over other 
current quantitative PCR methods (Celi et al., 1994): (i) the
probe-based homogeneous assay provides a real-time method for
detecting only specific amplification products, since specific
hybridization of both the primers and the probe is necessary to generate a
signal; (ii) the Ct (threshold cycle) value used for quantification is
measured when PCR amplification is still in the log phase of PCR
product accumulation. This is the main reason why Ct is a more
reliable measure of the starting copy number than are end-point
measurements, in which a slight difference in a limiting component
can have a drastic effect on the amount of product; (iii) use of Ct
values gives a wider dynamic range (at least 5 orders of magnitude),
reducing the need for serial dilution; (iv) the real-time PCR
method is run in a closed-tube system and requires no post-PCR
sample handling, thus avoiding potential contamination; (v) the
system is highly automated, since the instrument continuously
measures fluorescence in all 96 wells of the thermal cycler during
PCR amplification and the corresponding software processes and
analyzes the fluorescence data; (vi) the assay is rapid, as results are
available just one minute after thermal cycling is complete; (vii) the
sample throughput of the method is high, since 96 reactions can be
analyzed in 2 hr.

Here, we applied this semi-automated procedure to determine
the copy numbers of the 3 most frequently amplified genes in breast
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app)
located in a chromosome region in which no genetic changes have 
been observed in breast tumors. The results for 108 breast tumors 
were compared with previous Southern-blot data for the same 
samples. 

MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed
surgically from patients at the Centre René Huguenin; none of the
patients had undergone radiotherapy or chemotherapy. Immediately
after surgery, the tumor samples were placed in liquid
nitrogen until extraction of high-molecular-weight DNA. Patients
were included in this study if the tumor sample used for DNA 
preparation contained more than 60% of tumor cells (histological 
analysis). A blood sample was also taken from 18 of the same 
patients. 

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR 

Theoretical basis. Reactions are characterized by the point 
during cycling when amplification of the PCR product is first 
detected, rather than by the amount of PCR product accumulated 
after a fixed number of cycles. The higher the starting copy number 
of the genomic DNA target, the earlier a significant increase in 
fluorescence is observed. The parameter Ct (threshold cycle) is
defined as the fractional cycle number at which the fluorescence
generated by cleavage of the probe passes a fixed threshold above
baseline. The target gene copy number in unknown samples is
quantified by measuring Ct and by using a standard curve to
determine the starting copy number. The precise amount of 
genomic DNA (based on optical density) and its quality (i.e., lack 



of extensive degradation) are both difficult to assess. We therefore 
also quantified a control gene (alb) mapping to chromosome region 
4q11-q13, in which no genetic alterations have been found in
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994).

Thus, the ratio of the copy number of the target gene to the copy 
number of the alb gene normalizes the amount and quality of 
genomic DNA. The ratio defining the level of amplification is 
termed "N", and is determined as follows: 

$$N = \frac{\text{copy number of target gene (app, myc, ccnd1, erbB2)}}{\text{copy number of reference gene (alb)}}$$
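
The ratio N can be sketched in a few lines. The function name and the copy numbers below are hypothetical and serve only to show how dividing by the alb copy number cancels differences in the amount of DNA loaded.

```python
def amplification_level(target_copies: float, alb_copies: float) -> float:
    """N: copy number of the target gene divided by the copy number of the
    reference gene (alb) measured in the same DNA sample."""
    if alb_copies <= 0:
        raise ValueError("reference-gene copy number must be positive")
    return target_copies / alb_copies


# Hypothetical copy numbers read from standard curves for one tumor DNA sample.
n_value = amplification_level(target_copies=33000.0, alb_copies=6600.0)
print(n_value)

# Loading half as much of the same DNA halves both measurements and leaves N unchanged.
assert amplification_level(16500.0, 3300.0) == n_value
```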

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, Ply- 
mouth, MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer 
Express (Perkin-Elmer Applied Biosystems, Foster City, CA). 

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes,
and MicroAmp caps were from Perkin-Elmer Applied Biosystems.

Standard-curve construction. The kinetic method requires a 
standard curve. The latter was constructed with serial dilutions of 
specific PCR products, according to Piatak et al. (1993). In 
practice, each specific PCR product was obtained by amplifying 20 
ng of a standard human genomic DNA (Boehringer, Mannheim, 
Germany) with the same primer pairs as those used later for 
real-time quantitative PCR. The 5 PCR products were purified 
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, Sweden), 
electrophoresed through an acrylamide gel and stained with 
ethidium bromide to check their quality. The PCR products were 
then quantified spectrophotometrically and pooled, and serially 
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA) 
at a constant concentration of 2 ng/µl. The standard curve used for 
real-time quantitative PCR was based on serial dilutions of the pool 
of PCR products ranging from 10^-7 (10^5 copies of each gene) to 
10^-10 (10^2 copies). This series of diluted PCR products was 
aliquoted and stored at -80°C until use. 

The standard curve was validated by analyzing 2 known 
quantities of calibrator human genomic DNA (20 ng and 50 ng). 

PCR amplification. Amplification mixes (50 µl) contained the 
sample DNA (around 20 ng, around 6,600 copies of disomic genes), 
10X TaqMan buffer (5 µl), 200 µM dATP, dCTP, dGTP, and 400 
µM dUTP, 5 mM MgCl2, 1.25 units of AmpliTaq Gold, 0.5 units of 
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and 
100 nM probe. The thermal cycling conditions comprised 2 min at 
50°C and 10 min at 95°C. Thermal cycling consisted of 40 cycles at 
95°C for 15 s and 65°C for 1 min. Each assay included: a standard 
curve (from 10^5 to 10^2 copies) in duplicate, a no-template control, 
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in 
triplicate, and about 20 ng of unknown genomic DNA in triplicate 
(26 samples can thus be analyzed on a 96-well microplate). All 
samples with a coefficient of variation (CV) higher than 10% were 
retested. 

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR. 

Equipment for real-time detection. The 7700 system has a 
built-in thermal cycler and a laser directed via fiber optical cables 
to each of the 96 sample wells. A charge-coupled-device (CCD) 
camera collects the emission from each sample and the data are 
analyzed automatically. The software accompanying the 7700 
system calculates Ct and determines the starting copy number in the 
samples. 
Determination of gene amplification. Gene amplification was 
calculated as described above. Only samples with an N value 
higher than 2 were considered to be amplified. 

RESULTS 

To validate the method, real-time PCR was performed on 
genomic DNA extracted from 108 primary breast tumors, and 18 
normal leukocyte DNA samples from some of the same patients. 
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes, 
and the β-amyloid precursor protein gene (app), which maps to a 
chromosome region (21q21.2) in which no genetic alterations have 
been found in breast tumors (Kallioniemi et al., 1994). The 
reference disomic gene was the albumin gene (alb, chromosome 
4q11-q13). 



Validation of the standard curve and dynamic range 
of real-time PCR 

The standard curve was constructed from PCR products serially 
diluted in genomic mouse DNA at a constant concentration of 
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze 
the 5 target genes do not amplify genomic mouse DNA (data not 
shown). Figure 1 shows the real-time PCR standard curve for the 
alb gene. The dynamic range was wide (at least 4 orders of 
magnitude), with samples containing as few as 10^2 copies or as 
many as 10^5 copies. 
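
The use of the standard curve can be sketched as follows. This is a minimal illustration, not the software supplied with the 7700 system: it fits Ct against log10 of the starting copy number by ordinary least squares and then reads an unknown sample off the fitted line. The Ct values and the function names are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Standards spanning 10^5 to 10^2 copies, as in the text.
# The Ct values below are hypothetical, not data from the study.
standard_copies = [1e5, 1e4, 1e3, 1e2]
standard_ct = [22.0, 25.4, 28.8, 32.2]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_copies = copies_from_ct(27.1, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={unknown_copies:.0f}")
```

Dividing the copy number obtained this way for a target gene by the copy number obtained for alb in the same sample gives the ratio N defined in the methods.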

Copy-number ratio of the 2 reference genes (app and alb) 

The app to alb copy-number ratio was determined in 18 normal 
leukocyte DNA samples and all 108 primary breast-tumor DNA 
Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging 
from 10^5 (A9), 10^4 (A7), 10^3 (A4) to 10^2 (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal 
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye 
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline 
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a 
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black 
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom: 
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in 
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic 
range. 
samples. We selected these 2 genes because they are located in 2 
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no 
obvious genetic changes (including gains or losses) have been 
observed in breast cancers (Kallioniemi et al., 1994). The ratio for 
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3 
(mean 1.02 ± 0.21), and was similar for the 108 primary breast- 
tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming 
that alb and app are appropriate reference disomic genes for 
breast-tumor DNA. The low range of the ratios also confirmed that 
the nucleotide sequences chosen for the primers and probes were 
not polymorphic, as mismatches of their primers or probes with the 
subject's DNA would have resulted in differential amplification. 

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA 

To determine the cut-off point for gene amplification in breast- 
cancer tissue, 18 normal leukocyte DNA samples were tested for 
the gene dose (N), calculated as described in "Material and 
Methods". The N value of these samples ranged from 0.5 to 1.3 
(mean 0.84 ± 0.22) for myc; 0.7 to 1.6 (mean 1.06 ± 0.23) for 
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values 
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently 
fell between 0.5 and 1.6, values of 2 or more were considered to 
represent gene amplification in tumor DNA. 

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA 

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary 
breast tumors are reported in Table I. Extra copies of ccnd1 were 
more frequent (23%, 25/108) than extra copies of erbB2 (15%, 
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for 
ccnd1, 2 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene. 
Figure 2 and Table II represent tumors in which the ccnd1 gene was 
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118). 
The 3 genes were never found to be co-amplified in the same tumor. 
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1 
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis 
that gene amplifications are independent events in breast cancer. 
Interestingly, 5 tumors showed a decrease of at least 50% in the 
erbB2 copy number (N < 0.5), suggesting that they bore deletions 
of the 17q21 region (the site of erbB2). No such decrease in copy 
number was observed with the other 2 proto-oncogenes. 
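
The decision rule applied to N in this section can be written as a short sketch. The cut-offs (2 or more for amplification, below 0.5 for a decreased copy number) are the ones stated in the text; the function name and the category labels are illustrative assumptions rather than the authors' terminology.

```python
def classify_gene_dose(n, amplified_cutoff=2.0, decreased_cutoff=0.5):
    """Classify a gene-dose ratio N using the cut-offs given in the text."""
    if n >= amplified_cutoff:
        return "amplified"
    if n < decreased_cutoff:
        return "decreased copy number"
    return "not amplified"


# Values quoted in the text: the highest N for each gene in the tumors,
# the highest N seen in normal leukocyte DNA (1.6), and a value below 0.5.
for n in (18.6, 15.1, 4.6, 1.6, 0.4):
    print(n, classify_gene_dose(n))
```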

Comparison of gene dose determined by real-time quantitative 
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications 
had previously been done on the same 108 primary breast tumors. A 
perfect correlation between the results of real-time PCR and 
Southern blot was obtained for tumors with high copy numbers 
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2) 
in which real-time PCR showed gene amplification whereas 
Southern-blot did not, but these were mainly cases with low extra 
copy numbers (N from 2 to 2.9). 

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 



TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc, 
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS 

                    Amplification level (N) 
Gene     <0.5        0.5-1.9       2-4.9         ≥5 
myc      0           97 (89.8%)    11 (10.2%)    0 
ccnd1    0           83 (76.9%)    17 (15.7%)    8 (7.4%) 
erbB2    5 (4.6%)    87 (80.6%)    8 (7.4%)      8 (7.4%) 



reagents and requires relatively large amounts of high-quality 
genomic DNA, which means it cannot be used routinely in many 
laboratories. An amplification step is therefore required to deter- 
mine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture 
specimens or formalin-fixed, paraffin-embedded tissues). 

In this study, we validated a PCR method developed for the 
quantification of gene over-representation in tumors. The method, 
based on real-time analysis of PCR amplification, has several 
advantages over other PCR-based quantitative assays such as 
competitive quantitative PCR (Celi et al., 1994). First, the real-time 
PCR method is performed in a closed-tube system, avoiding the 
risk of contamination by amplified products. Re-amplification of 
carryover PCR products in subsequent experiments can also be 
prevented by using the enzyme uracil N-glycosylase (UNG) 
(Longo et al., 1990). The second advantage is the simplicity and 
rapidity of sample analysis, since no post-PCR manipulations are 
required. Our results show that the automated method is reliable. 
We found it possible to determine, in triplicate, the number of 
copies of a target gene in more than 100 tumors per day. Third, the 
system has a linear dynamic range of at least 4 orders of magnitude, 
meaning that samples do not have to contain equal starting amounts 
of DNA. This technique should therefore be suitable for analyzing 
formalin-fixed, paraffin-embedded tissues. Fourth, and above all, 
real-time PCR makes DNA quantification much more precise and 
reproducible, since it is based on Ct values rather than end-point 
measurement of the amount of accumulated PCR product. Indeed, 
the ABI Prism 7700 Sequence Detection System enables Ct to be 
calculated when PCR amplification is still in the exponential phase 
and when none of the reaction components is rate-limiting. The 
within-run CV of the Ct value for calibrator human DNA (5 
replicates) was always below 5%, and the between-assay precision 
in 5 different runs was always below 10% (data not shown). In 
addition, the use of a standard curve is not absolutely necessary, 
since the copy number can be determined simply by comparing the 
Ct ratio of the target gene with that of reference genes. The results 
obtained by the 2 methods (with and without a standard curve) are 
similar in our experiments (data not shown). Moreover, unlike 
competitive quantitative PCR, real-time PCR does not require an 
internal control (the design and storage of internal controls and the 
validation of their amplification efficiency is laborious). 
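
The text does not give the formula used when no standard curve is run. One widely used formulation of such a comparison is the comparative Ct calculation sketched below. It is an assumption here, not necessarily the authors' calculation, and it holds only if the target and reference assays both amplify with close to 100% efficiency (a doubling per cycle). The function name and the Ct values are hypothetical.

```python
def relative_copy_number(ct_target, ct_reference,
                         ct_target_calibrator, ct_reference_calibrator):
    """Comparative Ct estimate of gene dose, assuming ~100% PCR efficiency.

    The target/reference Ct difference in the sample is compared with the
    same difference in a calibrator (e.g. normal genomic DNA).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical example: the target crosses the threshold 2 cycles earlier
# than the reference in the tumor, but at the same cycle in the calibrator.
print(relative_copy_number(24.0, 26.0, 26.0, 26.0))  # 4.0
```

If the efficiencies of the two assays differ noticeably, this shortcut becomes biased and the standard-curve route described in the methods is the safer choice.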

The only potential disadvantage of real-time PCR, like all other 
PCR-based methods and solid-matrix blotting techniques (Southern 
blots and dot blots), is that it cannot avoid dilution artifacts 
inherent in the extraction of DNA from tumor cells contained in 
heterogeneous tissue specimens. Only FISH and immunohistochemistry 
can measure alterations on a cell-by-cell basis (Pauletti et al., 
1996; Slamon et al., 1989). However, FISH requires expensive 
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams, 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in 
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2 
(which bear alb and app, respectively) showed no genetic alter- 
ations in the breast-cancer samples studied here, in keeping with 
the results of CGH (Kallioniemi et al., 1994). (ii) We found that 
amplifications of these 3 oncogenes were independent events, as 
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii) 
The frequency and degree of myc amplification in our breast tumor 
DNA series were lower than those of ccnd1 and erbB2 amplifica- 
tion, confirming the findings of Borg et al. (1992) and Courjal et al. 
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation 
were 18-fold and 15-fold, also in keeping with earlier results (about 
Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6, black squares), T133 (G11, B4, red squares) 
and T145 (A8, C8, blue squares). Given the Ct of each sample, the initial copy number is inferred from the standard curve obtained during the same 
experiment. Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II. 



30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et 
al., 1997). (v) The erbB2 copy numbers obtained with real-time 
PCR were in good agreement with data obtained with other 
quantitative PCR-based assays in terms of the frequency and 
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron 
et al., 1996). Our results also correlate well with those recently 
published by Gelmini et al. (1997), who used the TaqMan system to 
measure erbB2 amplification in a small series of breast tumors 
(n = 25), but with an instrument (LS-50B luminescence spectrom- 
eter, Perkin-Elmer Applied Biosystems) which only allows end- 
TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS 
FROM 3 BREAST TUMORS(1) 

         ccnd1                       alb 
Tumor    Copy number  Mean    SD     Copy number  Mean   SD    N ccnd1/alb 
T118     4525                        4223 
         4605         4603    77     4365         4325   89    1.06 
         4678                        4387 
T133     59821                       9787 
         61659        61100   1111   10092        10137  375   6.03 
         61821                       10533 
T145     128563                      7321 
         125892       125392  3448   7762         7672   316   16.34 
         121722                      7933 

(1) For each sample, 3 replicate experiments were performed and the mean 
and the standard deviation (SD) were determined. The level of ccnd1 gene 
amplification (N ccnd1/alb) is determined by dividing the average ccnd1 
copy number value by the average alb copy number value. 



point measurement of fluorescence intensity. Here we report myc 
and ccnd1 gene dosage in breast cancer by means of quantitative 
PCR. (vi) We found a high degree of concordance between 
real-time quantitative PCR and Southern blot analysis in terms of 
gene amplification, especially for samples with high copy numbers 
(>5-fold). The slightly higher frequency of gene amplification 
(especially ccnd1 and erbB2) observed by means of real-time 
quantitative PCR as compared with Southern-blot analysis may be 
explained by the higher sensitivity of the former method. However, 
we cannot rule out the possibility that some tumors with a few extra 



gene copies observed in real-time PCR had additional copies of an 
arm or a whole chromosome (trisomy, tetrasomy or polysomy) 
rather than true gene amplification. These 2 types of genetic 
alteration (polysomy and gene amplification) could be easily 
distinguished in the future by using an additional probe located on 
the same chromosome arm, but some distance from the target gene. 
It is noteworthy that high gene copy numbers have the greatest 
prognostic significance in breast carcinoma (Borg et al., 1992; 
Slamon et al., 1987). 
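
The gene-dose values reported in Table II can be checked directly from the replicate copy numbers. The sketch below recomputes the mean, the sample standard deviation and N (mean ccnd1 copy number divided by mean alb copy number) for the 3 tumors. The replicate values are those printed in Table II; the data layout and the function name are illustrative choices.

```python
from statistics import mean, stdev

# Replicate copy numbers as printed in Table II.
replicates = {
    "T118": {"ccnd1": [4525, 4605, 4678], "alb": [4223, 4365, 4387]},
    "T133": {"ccnd1": [59821, 61659, 61821], "alb": [9787, 10092, 10533]},
    "T145": {"ccnd1": [128563, 125892, 121722], "alb": [7321, 7762, 7933]},
}


def gene_dose(sample):
    """N = mean target copy number / mean reference (alb) copy number."""
    return mean(sample["ccnd1"]) / mean(sample["alb"])


for tumor, sample in replicates.items():
    print(
        tumor,
        round(mean(sample["ccnd1"])), round(stdev(sample["ccnd1"])),
        round(mean(sample["alb"])), round(stdev(sample["alb"])),
        round(gene_dose(sample), 2),
    )
```

The recomputed ratios agree with the N values of 1.06, 6.03 and 16.34 listed in the table, and the SD column matches the sample (n - 1) standard deviation.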

Finally, this technique can be applied to the detection of gene 
deletion as well as gene amplification. Indeed, we found a 
decreased copy number of erbB2 (but not of the other 2 proto-oncogenes) 
in several tumors; erbB2 is located in a chromosome 
region (17q21) reported to contain both deletions and amplifica- 
tions in breast cancer (Bieche and Lidereau, 1995). 

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia, also for early diagnosis of cancer, 
staging, prognostication and choice of treatment. Southern blotting 
is not sufficiently sensitive, and FISH is lengthy and complex. 
Real-time quantitative PCR overcomes both these limitations, and 
is a sensitive and accurate method of analyzing large numbers of 
samples in a short time. It should find a place in routine clinical 
gene dosage. 

ACKNOWLEDGEMENTS 

RL is a research director at the Institut National de la Sante et de 
la Recherche Medicale (INSERM). We thank the staff of the Centre 
Rene Huguenin for assistance in specimen collection and patient 
care. 



REFERENCES 



An, H.X., Niederacher, D., Beckmann, M.W., Gohring, U.J., Scharl, A., 
Picard, R., Van Roeyen, C., Schnurch, H.G. and Bender, H.G., erbB2 
gene amplification detected by fluorescent differential polymerase chain 
reaction in paraffin-embedded breast carcinoma tissues. Int. J. Cancer 
(Pred. Oncol.), 64, 291-297 (1995). 

Berns, E.M.J.J., Klijn, J.G.M., Van Putten, W.L.J., Van Staveren, I.L., 
Portengen, H. and Foekens, J.A., c-myc amplification is a better prognos- 
tic factor than HER2/neu amplification in primary breast cancer. Cancer 
Res., 52, 1107-1113 (1992). 

Bieche, I. and Lidereau, R., Genetic alterations in breast cancer. Genes 
Chrom. Cancer, 14, 227-251 (1995). 

Borg, A., Baldetorp, B., Ferno, M., Olsson, H. and Sigurdsson, H., 
c-myc amplification is an independent prognostic factor in post-menopausal 
breast cancer. Int. J. Cancer, 51, 687-691 (1992). 

Celi, F.S., Cohen, M.M., Antonarakis, S.E., Wertheimer, E., Roth, J. 
and Shuldiner, A.R., Determination of gene dosage by a quantitative 
adaptation of the polymerase chain reaction (gd-PCR): rapid detection of 
deletions and duplications of gene sequences. Genomics, 21, 304-310 
(1994). 

Courjal, F., Cuny, M., Simony-Lafontaine, J., Louasson, G., Speiser, P., 
Zeillinger, R., Rodriguez, C. and Theillet, C., Mapping of DNA 
amplifications at 15 chromosomal localizations in 1875 breast tumors: 
definition of phenotypic groups. Cancer Res., 57, 4360-4367 (1997). 

Deng, G., Yu, M., Chen, L.C., Moore, D., Kurisu, W., Kallioniemi, A., 
Waldman, F.M., Collins, C. and Smith, H.S., Amplifications of oncogene 
erbB-2 and chromosome 20q in breast cancer determined by differentially 
competitive polymerase chain reaction. Breast Cancer Res. Treat., 40, 
271-281 (1996). 

Gelmini, S., Orlando, C., Sestini, R., Vona, G., Pinzani, P., Ruocco, L. 
and Pazzagli, M., Quantitative polymerase chain reaction-based homoge- 
neous assay with fluorogenic probes to measure c-erbB-2 oncogene amplifi- 
cation. Clin. Chem., 43, 752-758 (1997). 

Gibson, U.E.M., Heid, C.A. and Williams, P.M., A novel method for 
real-time quantitative RT-PCR. Genome Res., 6, 995-1001 (1996). 

Heid, C.A., Stevens, J., Livak, K.J. and Williams, P.M., Real-time 
quantitative PCR. Genome Res., 6, 986-994 (1996). 

Holland, P.M., Abramson, R.D., Watson, R. and Gelfand, D.H., 
Detection of specific polymerase chain reaction product by utilizing the 5' 
to 3' exonuclease activity of Thermus aquaticus DNA polymerase. Proc. 
nat. Acad. Sci. (Wash.), 88, 7276-7280 (1991). 

Kallioniemi, A., Kallioniemi, O.P., Piper, J., Tanner, M., Stokkes, T., 
Chen, L., Smith, H.S., Pinkel, D., Gray, J.W. and Waldman, F.M., 
Detection and mapping of amplified DNA sequences in breast cancer by 
comparative genomic hybridization. Proc. nat. Acad. Sci. (Wash.), 91, 
2156-2160 (1994). 

Lee, L.G., Connell, C.R. and Bloch, W., Allelic discrimination by 
nick-translation PCR with fluorogenic probe. Nucleic Acids Res., 21, 
3761-3766 (1993). 

Longo, N., Berninger, N.S. and Hartley, J.L., Use of uracil DNA 
glycosylase to control carry-over contamination in polymerase chain 
reactions. Gene, 93, 125-128 (1990). 

Muss, H.B., Thor, A.D., Berry, D.A., Kute, T., Liu, E.T., Koerner, F., 
Cirrincione, C.T., Budman, D.R., Wood, W.C., Barcos, M. and Hender- 
son, I.C., c-erbB-2 expression and response to adjuvant therapy in women 
with node-positive early breast cancer. New Engl. J. Med., 330, 1260-1266 
(1994). 

Pauletti, G., Godolphin, W., Press, M.F. and Slamon, D.J., Detection and 
quantification of HER-2/neu gene amplification in human breast cancer 
archival material using fluorescence in situ hybridization. Oncogene, 13, 
63-72 (1996). 

Piatak, M., Luk, K.C., Williams, B. and Lifson, J.D., Quantitative 
competitive polymerase chain reaction for accurate quantitation of HIV 
DNA and RNA species. Biotechniques, 14, 70-80 (1993). 

Schuuring, E., Verhoeven, E., Van Tinteren, H., Peterse, J.L., Nunnink, 
B., Thunnissen, F.B.J.M., Devilee, P., Cornelisse, C.J., Van de Vijver, 
M.J., Mooi, W.J. and Michalides, R.J.A.M., Amplification of genes within 
the chromosome 11q13 region is indicative of poor prognosis in patients 
with operable breast cancer. Cancer Res., 52, 5229-5234 (1992). 

Slamon, D.J., Clark, G.M., Wong, S.G., Levin, W.S., Ullrich, A. and 
McGuire, W.L., Human breast cancer: correlation of relapse and survival 
with amplification of the HER-2/neu oncogene. Science, 235, 177-182 
(1987). 

Slamon, D.J., Godolphin, W., Jones, L.A., Holt, J.A., Wong, S.G., Keith, 
D.E., Levin, W.J., Stuart, S.G., Udove, J., Ullrich, A. and Press, M.F., 
Studies of the HER-2/neu proto-oncogene in human breast and ovarian 
cancer. Science, 244, 707-712 (1989). 

Valeron, P.F., Chirino, R., Fernandez, L., Torres, S., Navarro, D., 
Aguiar, J., Cabrera, J.J., Diaz-Chico, B.N. and Diaz-Chico, J.C., 
Validation of a differential PCR and an ELISA procedure in studying 
HER-2/neu status in breast cancer. Int. J. Cancer, 65, 129-133 (1996). 




IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Applicant: Ashkenazi et al. 
App. No.: 09/903,925 
Filed: July 11, 2001 
For: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
ACIDS ENCODING THE SAME 
Examiner: Hamud, Fozia M 
Group Art Unit 1647 

CERTIFICATE OF EXPRESS MAILING 

I hereby certify that this correspondence is being deposited with the 
United States Postal Service with sufficient postage as first class mail 
in an envelope addressed to Commissioner of Patents, Washington 
D.C. 20231 on: 

(Date) 



Commissioner of Patents 

P.O. Box 1450 

Alexandria, VA 22313-1450 

DECLARATION OF AVI ASHKENAZI, Ph.D. UNDER 37 C.F.R. § 1.132 

I, Avi Ashkenazi, Ph.D., declare and say as follows: 

1. I am Director and Staff Scientist at the Molecular Oncology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. I joined Genentech in 1988 as a postdoctoral fellow. Since then, I have 
investigated a variety of cellular signal transduction mechanisms, including apoptosis, and have 
developed technologies to modulate such mechanisms as a means of therapeutic intervention in 
cancer and autoimmune disease. I am currently involved in the investigation of a series of 
secreted proteins over-expressed in tumors, with the aim to identify useful targets for the 
development of therapeutic antibodies for cancer treatment. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 

4. Gene amplification is a process in which chromosomes undergo changes to 
contain multiple copies of certain genes that normally exist as a single copy, and is an important 
factor in the pathophysiology of cancer. Amplification of certain genes (e.g., Myc or Her2/Neu) 
gives cancer cells a growth or survival advantage relative to normal cells, and might also provide 
a mechanism of tumor cell resistance to chemotherapy or radiotherapy. 

5. If gene amplification results in over-expression of the mRNA and the 
corresponding gene product, then it identifies that gene product as a promising target for cancer 
therapy, for example by the therapeutic antibody approach. Even in the absence of over- 
expression of the gene product, amplification of a cancer marker gene - as detected, for example, 
by the reverse transcriptase TaqMan® PCR or the fluorescence in situ hybridization (FISH) 
assays - is useful in the diagnosis or classification of cancer, or in predicting or monitoring the 
efficacy of cancer therapy. An increase in gene copy number can result not only from 
intrachromosomal changes but also from chromosomal aneuploidy. It is important to understand 
that detection of gene amplification can be used for cancer diagnosis even if the determination 
includes measurement of chromosomal aneuploidy. Indeed, as long as a significant difference 
relative to normal tissue is detected, it is irrelevant if the signal originates from an increase in the 
number of gene copies per chromosome and/or an abnormal number of chromosomes. 

6. I understand that according to the Patent Office, absent data demonstrating that 
the increased copy number of a gene in certain types of cancer leads to increased expression of 
its product, gene amplification data are insufficient to provide substantial utility or well 
established utility for the gene product (the encoded polypeptide), or an antibody specifically 
binding the encoded polypeptide. However, even when amplification of a cancer marker gene 
does not result in significant over-expression of the corresponding gene product, this very 
absence of gene product over-expression still provides significant information for cancer 
diagnosis and treatment. Thus, if over-expression of the gene product does not parallel gene 
amplification in certain tumor types but does so in others, then parallel monitoring of gene 
amplification and gene product over-expression enables more accurate tumor classification and 
hence better determination of suitable therapy. In addition, absence of over-expression is crucial 
information for the practicing clinician. If a gene is amplified but the corresponding gene 
product is not over-expressed, the clinician accordingly will decide not to treat a patient with 
agents that target that gene product. 

7. I hereby declare that all statements made herein of my own knowledge are true 
and that all statements made on information or belief are believed to be true, and further that 
these statements were made with the knowledge that willful false statements and the like so 



made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code and that such willful statements may jeopardize the validity of the 
application or any patent issued thereon. 
1983: B.S. in Biochemistry, with honors, Hebrew University, Israel 
1986: Ph.D. in Biochemistry, Hebrew University, Israel 
1985-1986: 
Teaching assistant, undergraduate level course in Biochemistry 
Teaching assistant, graduate level course on Signal Transduction 
Postdoctoral fellow, Hormone Research Dept., UCSF, and 
Developmental Biology Dept., Genentech, Inc., with J. Ramachandran 
Postdoctoral fellow, Molecular Biology Dept., Genentech, Inc., 
with D. Capon 

Scientist, Molecular Biology Dept., Genentech, Inc. 
Senior Scientist, Molecular Oncology Dept., Genentech, Inc. 
Senior Scientist and Interim director, Molecular Oncology Dept., 
Genentech, Inc. 

Senior Scientist and preclinical project team leader, Genentech, Inc. 

Staff Scientist in Molecular Oncology, Genentech, Inc. 

Staff Scientist and Director in Molecular Oncology, Genentech, Inc. 
1988: First prize, The Boehringer Ingelheim Award 
Editorial: 

Editorial Board Member: Current Biology 
Associate Editor, Clinical Cancer Research. 
Associate Editor, Cancer Biology and Therapy. 

Refereed papers: 

1. Gertler, A., Ashkenazi, A., and Madar, Z. Binding sites for human growth 
hormone and ovine and bovine prolactins in the mammary gland and liver of the 
lactating cow. Mol. Cell. Endocrinol. 34, 51-57 (1984). 

2. Gertler, A., Shamay, A., Cohen, N., Ashkenazi, A., Friesen, H., Levanon, A., 
Gorecki, M., Aviv, H., Hadari, D., and Vogel, T. Inhibition of lactogenic 
activities of ovine prolactin and human growth hormone (hGH) by a novel form of 
a modified recombinant hGH. Endocrinology 118, 720-726 (1986). 

3. Ashkenazi, A., Madar, Z., and Gertler, A. Partial purification and characterization 
of bovine mammary gland prolactin receptor. Mol. Cell. Endocrinol. 50, 79-87 
(1987). 

4. Ashkenazi, A., Pines, M., and Gertler, A. Down-regulation of lactogenic 
hormone receptors in Nb2 lymphoma cells by cholera toxin. Biochemistry 
Internatl. 14, 1065-1072 (1987). 

5. Ashkenazi, A., Cohen, R., and Gertler, A. Characterization of lactogen receptors 
in lactogenic hormone-dependent and independent Nb2 lymphoma cell lines. 
FEBS Lett. 210, 51-55 (1987). 

6. Ashkenazi, A., Vogel, T., Barash, I., Hadari, D., Levanon, A., Gorecki, M., and 
Gertler, A. Comparative study on in vitro and in vivo modulation of lactogenic 
and somatotropic receptors by native human growth hormone and its modified 
recombinant analog. Endocrinology 121, 414-419 (1987). 

7. Peralta, E., Winslow, J., Peterson, G., Smith, D., Ashkenazi, A., Ramachandran, 
J., Schimerlik, M., and Capon, D. Primary structure and biochemical properties 
of an M2 muscarinic receptor. Science 236, 600-605 (1987). 

8. Peralta, E., Ashkenazi, A., Winslow, J., Smith, D., Ramachandran, J., and Capon, 
D. J. Distinct primary structures, ligand-binding properties and tissue-specific 
expression of four human muscarinic acetylcholine receptors. EMBO J. 6, 
3923-3929 (1987). 

9. Ashkenazi, A., Winslow, J., Peralta, E., Peterson, G., Schimerlik, M., Capon, D., 
and Ramachandran, J. An M2 muscarinic receptor subtype coupled to both 
adenylyl cyclase and phosphoinositide turnover. Science 238, 672-675 (1987). 
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1 0. Pines, M., Ashkenazi. A., Cohen-Chapnik, N., Binder, L., and Gertler, A. 
Inhibition of the proliferation of Nb2 lymphoma cells by femtomolar 
concentrations of cholera toxin and partial reversal of the effect by 1 2-o- 
tetradecanoyl-phorbol-13-acetate.y. Cell. Biochem. 37, 119-129 (1988). 

1 1 . Peralta, E. Ashkenazi. A., Winslow, J. Ramachandran, J., and Capon, D. 
Differential regulation of PI hydrolysis and adenylyl cyclase by muscarinic 
receptor subtypes. Nature 334, 434-437 (1988). 

12. Ashkenazi.. A. Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. 
Functionally distinct G proteins couple different receptors to PI hydrolysis in the 
same cell. Cell 56, 487-493 (1989). 

13. Ashkenazi, A., Ramachandran, J., and Capon, D. Acetylcholine analogue 
stimulates DNA synthesis in brain-derived cells via specific muscarinic 
acetylcholine receptor subtypes. Nature 340, 146-150(1989). 

14. Lamarre, D., Ashkenazi, A., Fleury, S., Smith, D., Sekaly, R., and Capon, D.
The MHC-binding and gp120-binding domains of CD4 are distinct and separable.
Science 245, 743-745 (1989).

15. Ashkenazi, A., Presta, L., Marsters, S., Camerato, T., Rosenthal, K., Fendly, B.,
and Capon, D. Mapping the CD4 binding site for human immunodeficiency
virus type 1 by alanine-scanning mutagenesis. Proc. Natl. Acad. Sci. USA. 87,
7150-7154 (1990).

16. Chamow, S., Peers, D., Byrn, R., Mulkerrin, M., Harris, R., Wang, W., Bjorkman,
P., Capon, D., and Ashkenazi, A. Enzymatic cleavage of a CD4 immunoadhesin
generates crystallizable, biologically active Fd-like fragments. Biochemistry 29,
9885-9891 (1990).

17. Ashkenazi, A., Smith, D., Marsters, S., Riddle, L., Gregory, T., Ho, D., and
Capon, D. Resistance of primary isolates of human immunodeficiency virus type
1 to soluble CD4 is independent of CD4-rgp120 binding affinity. Proc. Natl.
Acad. Sci. USA. 88, 7056-7060 (1991).

18. Ashkenazi, A., Marsters, S., Capon, D., Chamow, S., Figari, I., Pennica, D.,
Goeddel, D., Palladino, M., and Smith, D. Protection against endotoxic shock by
a tumor necrosis factor receptor immunoadhesin. Proc. Natl. Acad. Sci. USA. 88,
10535-10539 (1991).

19. Moore, J., McKeating, J., Huang, Y., Ashkenazi, A., and Ho, D. Virions of
primary HIV-1 isolates resistant to sCD4 neutralization differ in sCD4 affinity and
glycoprotein gp120 retention from sCD4-sensitive isolates. J. Virol. 66, 235-243
(1992).

20. Jin, H., Oksenberg, D., Ashkenazi, A., Peroutka, S., Duncan, A., Rozmahel, R.,
Yang, Y., Mengod, G., Palacios, J., and O'Dowd, B. Characterization of the
human 5-hydroxytryptamine1B receptor. J. Biol. Chem. 267, 5735-5738 (1992).

21. Marsters, A., Frutkin, A., Simpson, N., Fendly, B. and Ashkenazi, A. 
Identification of cysteine-rich domains of the type 1 tumor necrosis receptor 
involved in ligand binding. J. Biol. Chem. 267, 5747-5750 (1992).

22. Chamow, S., Kogan, T., Peers, D., Hastings, R., Byrn, R., and Ashkenazi, A. 
Conjugation of sCD4 without loss of biological activity via a novel carbohydrate- 
directed cross-linking reagent. J. Biol. Chem. 267, 15916-15922 (1992). 

23. Oksenberg, D., Marsters, A., O'Dowd, B., Jin, H., Havlik, S., Peroutka, S., and
Ashkenazi, A. A single amino-acid difference confers major pharmacologic
variation between human and rodent 5-HT1B receptors. Nature 360, 161-163
(1992).

24. Haak-Frendscho, M., Marsters, S., Chamow, S., Peers, D., Simpson, N., and
Ashkenazi, A. Inhibition of interferon γ by an interferon γ receptor
immunoadhesin. Immunology 79, 594-599 (1993).

25. Pennica, D., Lam, V., Weber, R., Kohr, W., Basa, L., Spellman, M., Ashkenazi,
A., Shire, S., and Goeddel, D. Biochemical characterization of the extracellular
domain of the 75-kd tumor necrosis factor receptor. Biochemistry 32, 3131-3138
(1993).

26. Barfod, L., Zheng, Y., Kuang, W., Hart, M., Evans, T., Cerione, R., and
Ashkenazi, A. Cloning and expression of a human CDC42 GTPase Activating
Protein reveals a functional SH3-binding domain. J. Biol. Chem. 268,
26059-26062 (1993).

27. Chamow, S., Zhang, D., Tan, X., Mhtre, S., Marsters, S., Peers, D., Byrn, R.,
Ashkenazi, A., and Yunghans, R. A humanized bispecific immunoadhesin-
antibody that retargets CD3+ effectors to kill HIV-1-infected cells. J. Immunol.
153, 4268-4280 (1994).

28. Means, R., Krantz, S., Luna, J., Marsters, S., and Ashkenazi, A. Inhibition of
murine erythroid colony formation in vitro by interferon γ and correction by
interferon γ receptor immunoadhesin. Blood 83, 911-915 (1994).

29. Haak-Frendscho, M., Marsters, S., Mordenti, J., Gillet, N., Chen, S.,
and Ashkenazi, A. Inhibition of TNF by a TNF receptor immunoadhesin:
comparison with an anti-TNF mAb. J. Immunol. 152, 1347-1353 (1994).

30. Chamow, S., Kogan, T., Venuti, M., Gadek, T., Peers, D., Mordenti, J., Shak, S.,
and Ashkenazi, A. Modification of CD4 immunoadhesin with monomethoxy-
PEG aldehyde via reductive alkylation. Bioconj. Chem. 5, 133-140 (1994).

31. Jin, H., Yang, R., Marsters, S., Bunting, S., Wurm, F., Chamow, S., and
Ashkenazi, A. Protection against rat endotoxic shock by p55 tumor necrosis factor
(TNF) receptor immunoadhesin: comparison to anti-TNF monoclonal antibody. J.
Infect. Diseases 170, 1323-1326 (1994).

32. Beck, J., Marsters, S., Harris, R., Ashkenazi, A., and Chamow, S. Generation of
soluble interleukin-1 receptor from an immunoadhesin by specific cleavage. Mol.
Immunol. 31, 1335-1344 (1994).

33. Pitti, B., Marsters, M., Haak-Frendscho, M., Osaka, G., Mordenti, J., Chamow, S.,
and Ashkenazi, A. Molecular and biological properties of an interleukin-1
receptor immunoadhesin. Mol. Immunol. 31, 1345-1351 (1994).

34. Oksenberg, D., Havlik, S., Peroutka, S., and Ashkenazi, A. The third intracellular
loop of the 5-HT2 receptor specifies effector coupling. J. Neurochem. 64,
1440-1447 (1995).

35. Bach, E., Szabo, S., Dighe, A., Ashkenazi, A., Aguet, M., Murphy, K., and
Schreiber, R. Ligand-induced autoregulation of IFN-γ receptor β chain expression
in T helper cell subsets. Science 270, 1215-1218 (1995).

36. Jin, H., Yang, R., Marsters, S., Ashkenazi, A., Bunting, S., Marra, M., Scott, R.,
and Baker, J. Protection against endotoxic shock by bactericidal/permeability- 
increasing protein in rats. J. Clin. Invest. 95, 1947-1952 (1995). 

37. Marsters, S., Pennica, D., Bach, E., Schreiber, R., and Ashkenazi, A. Interferon γ
signals via a high-affinity multisubunit receptor complex that contains two types
of polypeptide chain. Proc. Natl. Acad. Sci. USA. 92, 5401-5405 (1995).

38. Van Zee, K., Moldawer, L., Oldenburg, H., Thompson, W., Stackpole, S.,
Montegut, W., Rogy, M., Meschter, C., Gallati, H., Schiller, C., Richter, W.,
Loetcher, H., Ashkenazi, A., Chamow, S., Wurm, F., Calvano, S., Lowry, S., and
Lesslauer, W. Protection against lethal E. coli bacteremia in baboons by
pretreatment with a 55-kDa TNF receptor-Ig fusion protein, Ro45-2081. J.
Immunol. 156, 2221-2230 (1996).

39. Pitti, R., Marsters, S., Ruppert, S., Donahue, C., Moore, A., and Ashkenazi, A.
Induction of apoptosis by Apo-2 Ligand, a new member of the tumor necrosis
factor cytokine family. J. Biol. Chem. 271, 12687-12690 (1996).

40. Marsters, S., Pitti, R., Donahue, C., Ruppert, S., Bauer, K., and Ashkenazi, A.
Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by
CrmA. Curr. Biol. 6, 1669-1676 (1996).

41. Marsters, S., Skubatch, M., Gray, C., and Ashkenazi, A. Herpesvirus entry
mediator, a novel member of the tumor necrosis factor receptor family, activates
the NF-kB and AP-1 transcription factors. J. Biol. Chem. 272, 14029-14032
(1997).

42. Sheridan, J., Marsters, S., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D.,
Ramakrishnan, L., Gray, C., Baker, K., Wood, W.I., Goddard, A., Godowski, P., and
Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and
decoy receptors. Science 277, 818-821 (1997).

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A.,
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997).

44. Marsters, A., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A.
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol.
8, 525-528 (1998).

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand:
a novel weapon against malignant glioma? FEBS Lett. 427, 124-128 (1998).

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A., and MacDonald, T. A p55 TNF
receptor immunoadhesin prevents T cell mediated intestinal injury by inhibiting
matrix metalloproteinase production. J. Immunol. 160, 4098-4103 (1998).

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A.,
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W., Gurney, A.,
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature
396, 699-703 (1998).

48. Mori, S., Marakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B.
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis 
by actinomycin D. J. Immunol. 162, 5616-5623 (1999). 

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A.,
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K.,
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr.
Biol. 9, 215-218 (1999).

50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie,
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Khoumenis, L, Lewis, D.,
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104,
155-162 (1999).

51. Chuntharapai, A., Gibbs, V., Lu, J., Ow, A., Marsters, S., Ashkenazi, A., De Vos,
A., Kim, K.J. Determination of residues involved in ligand binding and signal
transmission in the human IFN-a receptor 2. J. Immunol. 163, 766-773 (1999).

52. Johnsen, A.-C., Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A.,
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672
(1999).

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans, L,
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant
glioma xenografts by Apo2L/TRAIL. Biochem. Biophys. Res. Commun. 265,
479-483 (1999).

54. Hymowitz, S.G., Christinger, H.W., Fuh, G., Ultsch, M., O'Connell, M., Kelley, 
R.F., Ashkenazi, A. and de Vos, A.M. Triggering Cell Death: The Crystal 
Structure of Apo2L/TRAIL in a Complex with Death Receptor 5. Molec. Cell 4, 
563-571 (1999). 

55. Hymowitz, S.G., O'Connell, M.P., Ultsch, M.H., Hurst, A., Totpal, K., Ashkenazi,
A., de Vos, A.M., Kelley, R.F. A unique zinc-binding site revealed by a high-
resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 39,
633-640 (2000).

56. Zhou, Q., Fukushima, P., DeGraff, W., Mitchell, J.B., Stetler-Stevenson, M.,
Ashkenazi, A., and Steeg, P.S. Radiation and the Apo2L/TRAIL apoptotic
pathway preferentially inhibit the colonization of premalignant human breast
cancer cells overexpressing cyclin D1. Cancer Res. 60, 2611-2615 (2000).

57. Kischkel, F.C., Lawrence, D. A., Chuntharapai, A., Schow, P., Kim, J., and
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000).

58. Yan, M., Marsters, S.A., Grewal, I.S., Wang, H., *Ashkenazi, A., and *Dixit,
V.M. Identification of a receptor for BLyS demonstrates a crucial role in humoral
immunity. Nature Immunol. 1, 37-41 (2000).

59. Marsters, S.A., Yan, M., Pitti, R.M., Haas, P.E., Dixit, V.M., and Ashkenazi, A.
Interaction of the TNF homologues BLyS and APRIL with the TNF receptor
homologues BCMA and TACI. Curr. Biol. 10, 785-788 (2000).

60. Kischkel, F.C., and Ashkenazi, A. Combining enhanced metabolic labeling with
immunoblotting to detect interactions of endogenous cellular proteins. 
Biotechniques 29, 506-512 (2000). 

61. Lawrence, D., Shahrokh, Z., Marsters, S., Achilles, K., Shih, D., Mounho, B.,
Hillan, K., Totpal, K., DeForge, L., Schow, P., Hooley, J., Sherwood, S., Pai, R.,
Leung, S., Khan, L., Gliniak, B., Bussiere, J., Smith, C., Strom, S., Kelley, S.,
Fox, J., Thomas, D., and Ashkenazi, A. Differential hepatocyte toxicity of
recombinant Apo2L/TRAIL versions. Nature Med. 7, 383-385 (2001). 

62. Chuntharapai, A., Dodge, K., Grimmer, K., Schroeder, K., Marsters, S.A.,
Koeppen, H., Ashkenazi, A., and Kim, K.J. Isotype-dependent inhibition of
tumor growth in vivo by monoclonal antibodies to death receptor 4. J. Immunol.
166, 4891-4898 (2001).

63. Pollack, I.F., Erff, M., and Ashkenazi, A. Direct stimulation of apoptotic
signaling by soluble Apo2L/tumor necrosis factor-related apoptosis-inducing
ligand leads to selective killing of glioma cells. Clin. Cancer Res. 7, 1362-1369
(2001).

64. Wang, H., Marsters, S.A., Baker, T., Chan, B., Lee, W.P., Fu, L., Tumas, D., Yan,
M., Dixit, V.M., *Ashkenazi, A., and *Grewal, I.S. TACI-ligand interactions are
required for T cell activation and collagen-induced arthritis in mice. Nature
Immunol. 2, 632-637 (2001).

65. Kischkel, F.C., Lawrence, D. A., Tinel, A., Virmani, A., Schow, P., Gazdar, A., 
Blenis, J., Arnott, D., and Ashkenazi, A. Death receptor recruitment of
endogenous caspase-10 and apoptosis initiation in the absence of caspase-8. J. 
Biol. Chem. 276, 46639-46646 (2001). 

66. LeBlanc, H., Lawrence, D.A., Varfolomeev, E., Totpal, K., Morlan, J., Schow, P.,
Fong, S., Schwall, R., Sinicropi, D., and Ashkenazi, A. Tumor cell resistance to
death receptor induced apoptosis through mutational inactivation of the
proapoptotic Bcl-2 homolog Bax. Nature Med. 8, 274-281 (2002).

67. Miller, K., Meng, G., Liu, J., Hurst, A., Hsei, V., Wong, W-L., Ekert, R.,
Lawrence, D., Sherwood, S., DeForge, L., Gaudreault, Keller, G., Sliwkowski,
M., Ashkenazi, A., and Presta, L. Design, construction, and analyses of
multivalent antibodies. J. Immunol. 170, 4854-4861 (2003).

68. Varfolomeev, E., Kischkel, F., Martin, F., Wanh, H., Lawrence, D., Olsson, C.,
Tom, L., Erickson, S., French, D., Schow, P., Grewal, I., and Ashkenazi, A.
Immune system development in APRIL knockout mice. Submitted. 

Review articles: 

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.J.
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold
Spring Harbor Symposium on Quantitative Biology LIII, 263-272 (1988).

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functional diversity of muscarinic receptor subtypes in cellular signal
transduction and growth. Trends Pharmacol. Sci. Dec Supplement, 12-21 (1989).

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn,
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992).

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol. 10,
217-225 (1993).

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27 (1994).

6. Krantz, S. B., Means, R. T., Jr., Lina, J., Marsters, S. A., and Ashkenazi, A. 
Inhibition of erythroid colony formation in vitro by gamma interferon. In

I, Paul Polakis, Ph.D., declare and say as follows:

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has 
been leading Genentech's Tumor Antigen Project, which is a large research project 
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4
above, we have observed that there is a strong correlation between changes in the
level of mRNA present in any particular cell type and the level of protein
expressed from that mRNA in that cell type. In approximately 80% of our
observations we have found that increases in the level of a particular mRNA
correlate with changes in the level of protein expressed from that mRNA when
human tumor cells are compared with their corresponding normal cells.

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 
thereon.

Genome-wide Study of Gene Copy Numbers,
Transcripts, and Protein Levels in Pairs of
Non-invasive and Invasive Human Transitional
Cell Carcinomas*

Gain and loss of chromosomal material is characteristic
of bladder cancer, as well as malignant transformation in 
general. The consequences of these changes at both the 
transcription and translation levels is at present unknown 
partly because of technical limitations. Here we have at- 
tempted to address this question in pairs of non-invasive 
and invasive human bladder tumors using a combination 
of technology that included comparative genomic hybrid- 
ization, high density oligonucleotide array-based monitor- 
ing of transcript levels (5600 genes), and high resolution 
two-dimensional gel electrophoresis. The results showed 
that there is a gene dosage effect that in some cases 
superimposes on other regulatory mechanisms. This ef- 
fect depended (p < 0.015) on the magnitude of the com- 
parative genomic hybridization change. In general (18 of 
23 cases), chromosomal areas with more than 2-fold gain 
of DNA showed a corresponding increase in mRNA tran- 
scripts. Areas with loss of DNA, on the other hand, 
showed either reduced or unaltered transcript levels. Be- 
cause most proteins resolved by two-dimensional gels 
are unknown it was only possible to compare mRNA and 
protein alterations in relatively few cases of well focused 
abundant proteins. With few exceptions we found a good 
correlation (p < 0.005) between transcript alterations and 
protein levels. The implications, as well as limitations, 
of the approach are discussed. Molecular & Cellular 
Proteomics 1:37-45, 2002. 

Aneuploidy is a common feature of most human cancers
(1), but little is known about the genome-wide effect of this
phenomenon at both the transcription and translation levels.
High throughput array studies of the breast cancer cell line
BT474 have suggested that there is a correlation between
DNA copy numbers and gene expression in highly amplified
areas (2), and studies of individual genes in solid tumors
have revealed a good correlation between gene dose and
mRNA or protein levels in the case of c-erb-B2, cyclin D1,
ems1, and N-myc (3-5). However, a high cyclin D1 protein
expression has been observed without simultaneous
amplification (4), and a low level of c-myc copy number
increase was observed without concomitant c-myc protein
overexpression (6).

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH) 1 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of non-invasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the following
alterations have been reported: 2q-, 11p-, 1q+,
11q13+, 17q+, and 20q+ (7-12). It has been suggested that
these regions harbor tumor suppressor genes and onco- 
genes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide tech- 
nology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and pro- 
teomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of non-invasive and in- 
vasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 

Material— Bladder tumor biopsies were sampled after informed
consent was obtained and after removal of tissue for routine pathology
examination. By light microscopy tumors 335 and 532 were
staged by an experienced pathologist as pTa (superficial papillary),
grade I and II, respectively. Tumors 733 and 827 were staged as pT1
(invasive into submucosa), 733 was staged as solid, and 827 was
staged as papillary, both grade III.

1 The abbreviations used are: CGH, comparative genomic hybridization;
TCC, transitional cell carcinoma; LOH, loss of heterozygosity;
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D,
two-dimensional.

Fig. 1. DNA copy number and mRNA expression level. Shown from left to right are chromosome (Chr.), CGH profiles, gene location and
expression level of specific genes, and overall expression level along the chromosome. A, expression of mRNA in invasive tumor 733 as
compared with the non-invasive counterpart tumor 335. B, expression of mRNA in invasive tumor 827 compared with the non-invasive
counterpart tumor 532. The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome
(left). The bold curve in the ratio profile represents a mean of four chromosomes and is surrounded by thin curves indicating one standard
deviation. The central vertical line (broken) indicates a ratio value of 1 (no change), and the vertical lines next to it (dotted) indicate a ratio of
0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor 335 used for comparison showed alterations in DNA content, the ratio
profile of that chromosome is shown to the right of the invasive tumor profile. The colored bars represent one gene each, identified by the
running numbers above the bars (the name of the gene can be seen at www.MDL.DK/sdata.html). The bars indicate the purported location of
the gene, and the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart: >2-fold
increase (black), >2-fold decrease (blue), no significant change (orange). The bar to the far right, entitled Expression, shows the resulting change
in expression along the chromosome; the colors indicate that at least half of the genes were up-regulated (black), at least half of the genes
down-regulated (blue), or more than half of the genes are unchanged (orange). If a gene was absent in one of the samples and present in
another, it was regarded as more than a 2-fold change. A 2-fold level was chosen as this corresponded to one standard deviation in a double
determination of ~1800 genes. Centromeres and heterochromatic regions were excluded from data analysis.

mRNA Preparation— Tissue biopsies, obtained fresh from surgery, 
were embedded immediately in a sodium-guanidinium thiocyanate 
solution and stored at -80 °C. Total RNA was isolated using the 
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH). 
poly(A) + RNA was isolated by an oligo(dT) selection step (Oligotex 
mRNA kit; Qiagen). 

cRNA Preparation— 1 µg of mRNA was used as starting material.
The first and second strand cDNA synthesis was performed using the
Superscript® choice system (Invitrogen) according to the manufacturer's
instructions but using an oligo(dT) primer containing a T7 RNA
polymerase binding site. Labeled cRNA was prepared using the
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and
UTP (Enzo) were used, together with unlabeled NTPs in the reaction.
Following the in vitro transcription reaction, the unincorporated
nucleotides were removed using RNeasy columns (Qiagen).

Array Hybridization and Scanning— Array hybridization and scanning
was modified from a previous method (13). 10 µg of cRNA was
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization,
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl,
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min,
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe
array cartridge. The probe array was then incubated for 16 h at 40 °C
at constant rotation (60 rpm). The probe array was exposed to 10
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T
at 50 °C. The biotinylated cRNA was stained with a streptavidin-
phycoerythrin conjugate, 10 µg/ml (Molecular Probes) in 6x SSPE-T
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The
probe arrays were scanned at 560 nm using a confocal laser scanning
microscope (made for Affymetrix by Hewlett-Packard). The readings
from the quantitative scanning were analyzed by Affymetrix gene
expression analysis software.

Microsatellite Analysis— Microsatellite analysis was performed as
described previously (14). Microsatellites were selected by use of
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were obtained
from the genome data base at www.gdb.org. DNA was extracted
from tumor and blood and amplified by PCR in a volume of 20 µl for 35
cycles. The amplicons were denatured and electrophoresed for 3 h in an
ABI Prism 377. Data were collected in the Gene Scan program for
fragment analysis. Loss of heterozygosity was defined as less than 33%
of one allele detected in tumor amplicons compared with blood.
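
The loss-of-heterozygosity criterion above can be illustrated with a minimal Python sketch. The text gives only the 33% cutoff; the normalization used here (the tumor allele ratio divided by the blood allele ratio) is an assumption, as are the function names and the peak-height inputs, which are hypothetical.

```python
LOH_THRESHOLD = 0.33  # cutoff stated in the text: less than 33% of one allele


def allele_ratio(tumor_a, tumor_b, blood_a, blood_b):
    """Tumor A/B allele signal ratio normalized to the blood A/B ratio.

    Assumed normalization; inputs are hypothetical peak heights or areas.
    """
    if min(tumor_b, blood_a, blood_b) <= 0:
        raise ValueError("reference signals must be positive")
    return (tumor_a / tumor_b) / (blood_a / blood_b)


def has_loh(tumor_a, tumor_b, blood_a, blood_b, threshold=LOH_THRESHOLD):
    """Return True if either allele falls below the threshold in tumor."""
    ratio = allele_ratio(tumor_a, tumor_b, blood_a, blood_b)
    if ratio <= 0:
        return True  # allele A is completely absent from the tumor amplicons
    # Either allele may be the one lost, so test the smaller direction.
    return min(ratio, 1.0 / ratio) < threshold
```

For example, `has_loh(200, 1000, 1000, 1000)` evaluates a tumor in which one allele is at 20% of its expected level relative to blood.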

Proteomic Analysis— TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. Pro- 
teins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH— Hybridization of differentially labeled tumor and normal DNA
to normal metaphase chromosomes was performed as described
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red-
labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were
denatured at 37 °C for 5 min and applied to denatured normal
metaphase slides. Hybridization was at 37 °C for 2 days. After washing,
the slides were counterstained with 0.15 µg/ml 4,6-diamidino-2-
phenylindole in an anti-fade solution. A second hybridization was
performed for all tumor samples using fluorescein-labeled reference DNA
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the 
aberrations detected during the initial hybridization. Each CGH ex- 
periment also included a normal control hybridization using fluores- 
cein- and Texas Red-labeled normal DNA. Digital image analysis was 
used to identify chromosomal regions with abnormal fluorescence 
ratios, indicating regions of DNA gains and losses. The average 
green:red fluorescence intensity ratio profiles were calculated using 
four images of each chromosome (eight chromosomes total) with 
normalization of the green:red fluorescence intensity ratio for the 
entire metaphase and background correction. Chromosome identifi- 
cation was performed based on 4,6-diamidino-2-phenylindole band- 
ing patterns. Only images showing uniform high intensity fluores- 
cence with minimal background staining were analyzed. All 
centromeres, p arms of acrocentric chromosomes, and heterochro- 
matic regions were excluded from the analysis. 
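
The ratio-profile averaging described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the image-analysis software the authors used: the function names are hypothetical, each profile is assumed to be a list of background-corrected green:red ratios sampled at the same positions along a chromosome, and using the 0.5 and 2.0 reference lines from Fig. 1 as calling thresholds is an assumption.

```python
from statistics import mean, stdev


def normalize(profile, metaphase_ratio):
    """Divide each ratio by the whole-metaphase green:red ratio."""
    if metaphase_ratio <= 0:
        raise ValueError("metaphase ratio must be positive")
    return [r / metaphase_ratio for r in profile]


def mean_profile(profiles):
    """Per-position mean and standard deviation across chromosome images."""
    if not profiles:
        raise ValueError("at least one profile is required")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("profiles must be sampled at the same positions")
    columns = list(zip(*profiles))
    means = [mean(col) for col in columns]
    sds = [stdev(col) if len(col) > 1 else 0.0 for col in columns]
    return means, sds


def flag_positions(means, low=0.5, high=2.0):
    """Label each position as gain, loss, or none (assumed thresholds)."""
    return ["gain" if m >= high else "loss" if m <= low else "none"
            for m in means]
```

The mean and standard deviation returned by `mean_profile` correspond to the bold curve and the thin surrounding curves described in the Fig. 1 legend.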

RESULTS 

Comparative Genomic Hybridization— The CGH analysis
identified a number of chromosomal gains and losses in the
two invasive tumors (stage pT1, TCCs 733 and 827), whereas
the two non-invasive papillomas (stage pTa, TCCs 335 and
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-,
and Y-, respectively. Both invasive tumors showed changes
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-,
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typical
for their disease stage, as well as additional alterations,
some of which are shown in Fig. 1. Areas with gains and
losses deviated from the normal copy number to some extent,
and the average numerical deviation from normal was 0.4-fold
in the case of TCC 733 and 0.3-fold for TCC 827. The largest
changes, amounting to at least a doubling of chromosomal
content, were observed at 1q23 in TCC 733 (Fig. 1A) and
20q12 in TCC 827 (Fig. 1B).

Table I

Correlation between alterations detected by CGH and by expression monitoring

Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as
independent variable (if expression alteration - what CGH deviation was found).

mRNA Expression in Relation to DNA Copy Number— The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate experi- 
ments in which we compared TCCs 733 to 335 and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. Ap- 
proximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the indi- 
vidual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 



than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas contain- 
ing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 
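
The scoring rules described above (one ratio per mRNA, a 2-fold threshold for an informative difference, exclusion of areas containing only one gene) can be sketched in Python. This is a minimal illustration and not the authors' software: the function names, the input layout, and the example values are hypothetical. The rule that at least half of the mRNAs in a region must change in the same direction is taken from the legend of Fig. 2, and the sketch assumes all signal levels are positive.

```python
from collections import defaultdict

FOLD_CUTOFF = 2.0  # only differences larger than 2-fold are informative


def classify_ratio(invasive, noninvasive, cutoff=FOLD_CUTOFF):
    """Return 'up', 'down' or 'unchanged' for one mRNA (levels must be > 0)."""
    ratio = invasive / noninvasive
    if ratio > cutoff:
        return "up"
    if ratio < 1.0 / cutoff:
        return "down"
    return "unchanged"


def score_regions(genes):
    """Score each chromosomal region from (region, invasive, noninvasive) tuples."""
    calls = defaultdict(list)
    for region, invasive, noninvasive in genes:
        calls[region].append(classify_ratio(invasive, noninvasive))
    scores = {}
    for region, region_calls in calls.items():
        n = len(region_calls)
        if n < 2:
            continue  # areas containing only one gene are excluded
        # At least half of the mRNAs must move in the same direction.
        # Assumption: an exact up/down tie is resolved in favor of 'up'.
        if region_calls.count("up") * 2 >= n:
            scores[region] = "up"
        elif region_calls.count("down") * 2 >= n:
            scores[region] = "down"
        else:
            scores[region] = "unchanged"
    return scores


# Hypothetical signal levels, not data from the study.
example = [
    ("1q21", 900.0, 300.0),   # 3-fold up
    ("1q21", 500.0, 200.0),   # 2.5-fold up
    ("1q21", 400.0, 380.0),   # unchanged
    ("8q21", 100.0, 1000.0),  # 10-fold down
    ("8q21", 150.0, 400.0),   # about 2.7-fold down
    ("5p15", 700.0, 100.0),   # single gene in its area, excluded
]
print(score_regions(example))
```

Transcripts scored as absent in one sample would need separate handling, since a ratio cannot be formed from a zero level.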

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, both chromo- 
somes 1q21-q25, 2p and 9q, showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by decreased
expression in several cases, and were often registered
as having unaltered RNA levels (Table I, top). The inability
to detect RNA expression changes in these cases was not
because of fewer genes mapping to the lost regions (data not
shown).
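
The first set of calculations amounts to a conditional frequency: among chromosomal areas with a given CGH call, what fraction shows each expression call. The sketch below illustrates that bookkeeping under stated assumptions. The function name and the ten example areas are hypothetical and are not the values reported in Table I.

```python
from collections import Counter


def expression_given_cgh(areas, cgh_call):
    """Fraction of each expression call among areas with the given CGH call.

    areas: list of (cgh_call, expression_call) pairs, one per chromosomal area.
    """
    selected = [expression for cgh, expression in areas if cgh == cgh_call]
    if not selected:
        return {}
    counts = Counter(selected)
    return {call: counts[call] / len(selected) for call in counts}


# Hypothetical calls for ten chromosomal areas (not data from the study).
areas = [
    ("gain", "up"), ("gain", "up"), ("gain", "up"), ("gain", "unchanged"),
    ("loss", "down"), ("loss", "unchanged"), ("loss", "unchanged"),
    ("none", "unchanged"), ("none", "unchanged"), ("none", "up"),
]
print(expression_given_cgh(areas, "gain"))
print(expression_given_cgh(areas, "loss"))
```

Conditioning on the expression call instead of the CGH call corresponds to using altered expression as the independent variable (Table I, bottom).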

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and es- 
timated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression corre- 
lated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often de- 
tected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no alter- 
Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array
monitoring. The aberration is shown as a numerical fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the
ratio value of one were included.



ation in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see
Table I, bottom). Because the ability to observe reduced or
increased mRNA expression clustering to a certain chromosomal
area clearly reflected the extent of copy number
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the oligonucleo- 
tide arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003) a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent vari- 
ables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in expres- 
sion level, which is at the cut-off point for detection of expres- 
sion alterations. Gain of chromosomal material can occur to a 
much larger extent. 
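
The asymmetry between losses and gains can be made concrete with a small calculation. The sketch assumes, purely for illustration, that transcript level is proportional to gene copy number and that the normal state is two copies. The study does not establish that proportionality, and the function names are hypothetical.

```python
FOLD_CUTOFF = 2.0  # cut-off used for calling an expression alteration


def expected_fold_change(copies, normal_copies=2):
    """Expression ratio expected if transcript level scales with copy number."""
    return copies / normal_copies


def detectable(ratio, cutoff=FOLD_CUTOFF):
    """True when the ratio lies strictly beyond the cut-off in either direction."""
    return ratio > cutoff or ratio < 1.0 / cutoff


# One copy is the loss of a single allele; higher counts are gains.
for copies in (1, 3, 4, 6):
    ratio = expected_fold_change(copies)
    print(copies, ratio, detectable(ratio))
```

Under these assumptions, loss of one allele gives a ratio of 0.5, which sits exactly on the 2-fold cut-off and is not called, whereas a gain has no comparable upper bound.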

Microsatellite-based Detection of Minor Areas of Losses—In
TCC 733, several chromosomal areas exhibiting DNA
amplification were preceded or followed by areas with a normal
CGH but reduced mRNA expression (see Fig. 1, TCC 733
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and
10q22). To determine whether these results were because of
undetected loss of chromosomal material in these regions or 



because of other non-structural mechanisms regulating tran- 
scription, we examined two microsatellites positioned at chro- 
mosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript increase/decrease/increase.
Thus, for the areas showing increased expression
there was a correlation with the DNA copy number alterations
(Fig. 1A). As indicated above, the mRNA decrease observed in
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA down-regu- 
lation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of chromosome
11p showed a normal ratio in the CGH analysis;
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of expres- 
sion (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 
Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor
733 showing loss of heterozygosity at chromosome 1q25, detected
(a) by D1S215 close to Hu class I histocompatibility antigen (gene
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close
to general β-spectrin (gene number 11 in Fig. 1), and (d) tumor 827
showing loss of heterozygosity at chromosome 18q12 by D18S1118
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number
12 in Fig. 1). The upper curves show the electropherogram obtained
from normal DNA from leukocytes (N), and the lower curves show the
electropherogram from tumor DNA (T). In all cases one allele is
partially lost in the tumor amplicon.

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels— 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 
Fig. 4. Correlation between protein levels as judged by 2D-PAGE
and transcript ratio. For comparison proteins were divided in
three groups, unaltered in level or up- or down-regulated (horizontal
axis). The mRNA ratio as determined by oligonucleotide arrays was
plotted for each gene (vertical axis). A, mRNAs that were scored as
present in both tumors used for the ratio calculation; A, mRNAs that
were scored as absent in the invasive tumors (along horizontal axis) or
as absent in non-invasive reference (top of figure). Two different
scalings were used to exclude scaling as a confounder: TCCs 827
and 532 (AA) were scaled with background suppression, and TCCs
733 and 335 (#0) were scaled without suppression. Both comparisons
showed highly significant (p < 0.005) differences in mRNA ratios
between the groups. Proteins shown were as follows: Group A (from
left), phosphoglucomutase 1, glutathione transferase class μ number
4, fatty acid-binding protein homologue, cytokeratin 15, and
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa
heat shock protein, cytokeratin 13, and calcyclin; C (from left), α-enolase,
hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from
top), glutathione S-transferase-π and mesothelial keratin K7 (type II);
F (from top and left), adenylyl cyclase-associated protein, E-cadherin,
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV,
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B,
translationally controlled tumor protein, liver glyceraldehyde-3-phosphate
dehydrogenase, keratin 8, aldehyde reductase, and Na,K-ATPase
β-1 subunit; G (from top and left), TCP20, calgizzarin, 70-kDa
heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver
glyceraldehyde-3-phosphate dehydrogenase, glutathione S-transferase-π,
and keratin 8; H (from left), plasma gelsolin, autoantigen calreticulin,
thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit,
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 1,6-biphosphatase;
J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78,
cyclophilin, and cofilin.

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant corre- 
lation (p < 0.005) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of cyto- 
Fig. 5. Comparison of protein and transcript levels in invasive
and non-invasive TCCs. The upper part of the figure shows a 2D gel
(left) and the oligonucleotide array (right) of TCC 532. The red rectangles
on the upper gel highlight the areas that are compared below.
Identical areas of 2D gels of TCCs 532 and 827 are shown below.
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC
827 (red annotation). The tile on the array containing probes for
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532
and is compared with TCC 827. The upper row of squares in each tile
corresponds to perfect match probes; the lower row corresponds to
mismatch probes containing a mutation (used for correction for
unspecific binding). Absence of signal is depicted as black, and the
higher the signal the lighter the color. A high transcript level was
detected in TCC 532 (6151 units) whereas a much lower level was
detected in TCC 827 (absence of signals). For cytokeratin 13, a high
transcript level was also present in TCC 532 (15659 units), and a
much lower level was present in TCC 827 (623 units). The 2D gels at
the bottom of the figure (left) show levels of PA-FABP and adipocyte-FABP
in TCCs 335 and 733 (invasive), respectively. Both proteins are
down-regulated in the invasive tumor. To the right we show the array
tiles for the PA-FABP transcript. A medium transcript level was
detected in the case of TCC 335 (1277 units) whereas very low levels
were detected in TCC 733 (166 units). IEF, isoelectric focusing.



keratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly ex- 
pressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

Eleven chromosomal regions where CGH showed aberrations
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these proteins
were encoded by genes in chromosome 17q, a frequently
amplified chromosomal area in invasive bladder
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive 
and invasive TCCs using high throughput expression arrays 
and proteomics, in combination with CGH. In general, the 
results showed that there is a clear individual regulation of the 
mRNA expression of single genes, which in some cases was 
superimposed by a DNA copy number effect. In most cases, 
genes located in chromosomal areas with gains often exhib- 
ited increased mRNA expression, whereas areas showing 
losses showed either no change or a reduced mRNA expres- 
sion. The latter might be because of the fact that losses most 
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases, how- 



Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                Chromosomal  Tumor    CGH         Transcript           Protein
                       location     TCC      alteration  alteration           alteration

Annexin II             1q21         733      Gain        Abs to Pres a        Increase
Annexin IV             2p13         733      Gain        3.9-Fold up          Increase
Cytokeratin 17         17q12-q21    827      Gain        3.8-Fold up          Increase
Cytokeratin 20         17q21.1      827      Gain        5.6-Fold up          Increase
(PA-)FABP              8q21.2       827      Loss        10-Fold down         Decrease
FBP1                   9q22         827      Gain        2.3-Fold up          Increase
Plasma gelsolin        9q31         827      Gain        Abs to Pres          Increase
Heat shock protein 28  15q12-q13    827      Loss        2.5-Fold up          Decrease
Prohibitin             17q21        827/733  Gain        3.7-/2.5-Fold up b   Increase
Prolyl-4-hydroxyl      17q25        827/733  Gain        5.7-/1.6-Fold up     Increase
hnRNP B1               7p15         827      Loss        2.5-Fold down        Decrease

a Abs, absent; Pres, present.

b In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.
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ever, an increase or decrease in DNA copy number was 
associated with de novo occurrence or complete loss of tran- 
script, respectively. Some of these transcripts could not be 
detected in the non-invasive tumor but were present at rela- 
tively high levels in areas with DNA amplifications in the inva- 
sive tumors (e.g. in TCC 733 transcript from cellular ligand of 
annexin II gene (chromosome 1q21) from absent to 2670 
arbitrary units; in TCC 827 transcript from small proline-rich 
protein 1 gene (chromosome 1q12-q21.1) from absent to 
1326 arbitrary units). It may be anticipated from these data 
that significant clustering of genes with an increased expres- 
sion to a certain chromosomal area indicates an increased 
likelihood of gain of chromosomal material in this area. 

Considering the many possible regulatory mechanisms act- 
ing at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible 
that in chromosomes with increased DNA copy numbers two 
or more alleles could be demethylated simultaneously leading 
to a higher transcription level, whereas in chromosomes with 
losses the remaining allele could be partly methylated, turning 
off the process (20, 21). A recent report has documented a 
ploidy regulation of gene expression in yeast, but in this case all 
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that 
some chromosomal aberrations are common at certain 
stages of disease progression, often occurring in more than 1 
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y-
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33- and
9q- and Y-, respectively. Likewise, the two minimally invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These observations
indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and 
therefore the findings could be of general importance for 
bladder cancer. 

Considering that the mapping resolution of CGH is about
20 megabases, it is only possible to get a crude picture of
chromosomal instability using this technique. Occasionally, 
we observed reduced transcript levels close to or inside re- 
gions with increased copy numbers. Analysis of these regions 
by positioning heterozygous microsatellites as close as pos- 
sible to the locus showing reduced gene expression revealed 
loss of heterozygosity in several cases. It seems likely that 
multiple and different events occur along each chromosomal 



arm and that the use of cDNA microarrays for analysis of DNA 
copy number changes will reach a resolution that can resolve 
these changes, as has recently been proposed (2). The outlier 
data were not more frequent at the boundaries of the CGH 
aberrations. At present we do not know the mechanism be- 
hind chromosomal aneuploidy and cannot predict whether 
chromosomal gains will be transcribed to a larger extent than 
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often 
reduced in tumors. However, the relation between imprinting 
and gain of chromosomal material is not known. 

We regard it as a strength of this investigation that we were 
able to compare invasive tumors to benign tumors rather than 
to normal urothelium, as the tumors studied were biologically 
very close and probably may represent successive steps in 
the progression of bladder cancer. Despite the limited amount 
of fresh tissue available it was possible to apply three different 
state of the art methods. The observed correlation between 
DNA copy number and mRNA expression is remarkable when 
one considers that different pieces of the tumor biopsies were 
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogenous, a notion recently 
supported by CGH and LOH data that showed a remarkable 
similarity even between tumors and distant metastasis (10, 23). 

In the few cases analyzed, mRNA and protein levels 
showed a striking correspondence although in some cases 
we found discrepancies that may be attributed to translational 
regulation, post-translational processing, protein degrada- 
tion, or a combination of these. Some transcripts belong to 
undertranslated mRNA pools, which are associated with few 
translationally inactive ribosomes; these pools, however,
seem to be rare (24). Protein degradation, for example, may 
be very important in the case of polypeptides with a short 
half-life (e.g. signaling proteins). A poor correlation between 
mRNA and protein levels was found in liver cells as deter- 
mined by arrays and 2D-PAGE (25), and a moderate correla- 
tion was recently reported by Ideker et al. (26) in yeast. 

Interestingly, our study revealed a much better correlation 
between gained chromosomal areas and increased mRNA 
levels than between loss of chromosomal areas and reduced 
mRNA levels. In general, the level of CGH change determined 
the ability to detect a change in transcript. One possible 
explanation could be that by losing one allele the change in 
mRNA level is not so dramatic as compared with gain of 
material, which can be rather unlimited and may lead to a 
severalfold increase in gene copy number resulting in a much
higher impact on transcript level. The latter would be much 
easier to detect on the expression arrays as the cut-off point 
was placed at a 2-fold level so as not to be biased by noise on 
the array. Construction of arrays with a better signal to noise 
ratio may in the future allow detection of less than 2-fold
alterations in transcript levels, a feature that may facilitate the 
analysis of the effect of loss of chromosomal areas on tran- 
script levels. 
In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four 
of these proteins were encoded by genes located at a fre- 
quently amplified area in chromosome 17q. Whether DNA 
copy number is one of the mechanisms behind alteration of 
these eleven proteins is at present unknown and will have to 
be proved by other methods using a larger number of sam- 
ples. One factor making such studies complicated is the large 
extent of protein modification that occurs after translation, 
requiring immunoidentification and/or mass spectrometry to 
correctly identify the proteins in the gels. 

In conclusion, the results presented in this study exemplify 
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a tradi- 
tional chromosomal CGH method, but in the future high reso- 
lution CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and informa- 
tion derived from these types of experiments (2). Combined with 
expression arrays analyzing transcripts derived from genes with 
known locations, and 2D gel analysis to obtain information at 
the post-translational level, a clearer and more developed un- 
derstanding of the tumor genome will be forthcoming. 
ABSTRACT

Genetic changes underlie tumor progression and may lead to cancer-specific
expression of critical genes. Over 1100 publications have described
the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number
and mRNA expression levels of 13,824 genes to quantitate the impact of
genomic changes on gene expression. We identified and mapped the
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12
Mb. Throughout the genome, both high- and low-level copy number
changes had a substantial impact on gene expression, with 44% of the
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14
samples were systematically attributable to gene amplification. These
included most previously described amplified genes in breast cancer and
many novel targets for genomic alterations, including the HOXB7 gene,
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient prognosis.
In conclusion, CGH on cDNA microarrays revealed hundreds of
novel genes whose overexpression is attributable to gene amplification.
These genes may provide insights to the clonal evolution and progression
of breast cancer and highlight promising therapeutic targets.

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited.

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over 
Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of
over- and underexpressed genes (Y axis) according to copy number ratios (X axis).
Threshold values used for over- and underexpression were >2.184 (global upper 7% of
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage
of amplified and deleted genes according to expression ratios. Threshold values for
amplification and deletion were >1.5 and <0.7.



20 recurrent regions of DNA amplification have been mapped in 
breast cancer by CGH 5 (9, 10). However, these amplicons are often 
large and poorly defined, and their impact on gene expression remains 
unknown. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively in- 
volved in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA expres- 



5 The abbreviations used are: CGH, comparative genomic hybridization; FISH, fluorescence
in situ hybridization; RT-PCR, reverse transcription-PCR.
Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue
line) across the entire genome from 1p telomere to Xq telomere is shown along with ± 1 SD (orange lines). The black horizontal line indicates a ratio of 1.0; red line, a ratio of 0.8;
and green line, a ratio of 1.2. B-C, genome-wide copy number analysis in MCF-7 by CGH on cDNA microarray. The copy number ratios were plotted as a function of the position
of the cDNA clones along the human genome. In B, individual data points are connected with a line, and a moving median of 10 adjacent clones is shown. Red horizontal line, the
copy number ratio of 1.0. In C, individual data points are labeled by color coding according to cDNA expression ratios. The bright red dots indicate the upper 2%, and dark red dots,
the next 5% of the expression ratios in MCF-7 cells (overexpressed genes); bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the expression ratios
(underexpressed genes); the rest of the observations are shown with black crosses. The chromosome numbers are shown at the bottom of the figure, and chromosome boundaries are
indicated with a dashed line.



sion is most significantly associated with amplification of the corre- 
sponding genomic template. 

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, BT- 
474, HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468, 
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the 
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The
preparation and printing of the 13,824 cDNA clones on glass slides were
performed as described (11-13). Of these clones, 244 represented uncharacterized
expressed sequence tags, and the remainder corresponded to known
genes. CGH experiments on cDNA microarrays were done as described (14,
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technologies,
Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and
posthybridization washes (13) were done as described. For the expression
analyses, a standard reference (Universal Human Reference RNA; Stratagene,
La Jolla, CA) was used in all experiments. Forty μg of reference RNA were
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the 
average intensity of the corresponding clone in the control hybridization. For 
the copy number analysis, the ratios were normalized on the basis of the 
distribution of ratios of all targets on the array and for the expression analysis 
on the basis of 88 housekeeping genes, which were spotted four times onto the 
array. Low quality measurements (i.e., copy number data with mean reference 
intensity <100 fluorescent units, and expression data with both test and 
reference intensity <100 fluorescent units and/or with spot size <50 units) 



were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for
increased/decreased copy number. Genes with CGH ratio >1.43 (representing
the upper 5% of the CGH ratios across all experiments) were considered to be
amplified, and genes with ratio <0.73 (representing the lower 5%) were
considered to be deleted.
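
The cutpoint rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' software: the function names `percentile_cutpoints` and `classify` are hypothetical, the tail is taken by simple rank (the source does not say how the 5% tails were computed), and missing values are represented as `None`.

```python
def percentile_cutpoints(ratios, tail=0.05):
    """Return (low_cut, high_cut) for the lower and upper `tail` fraction.

    Assumption: tails are taken by rank, so the k smallest values are
    <= low_cut and the k largest values are >= high_cut.
    """
    ordered = sorted(r for r in ratios if r is not None)
    n = len(ordered)
    if n == 0:
        raise ValueError("no usable ratios")
    k = max(1, int(n * tail))  # number of values in each tail
    return ordered[k - 1], ordered[n - k]


def classify(ratio, low_cut=0.73, high_cut=1.43):
    """Label one CGH ratio using the cutpoints quoted in the text."""
    if ratio is None:
        return "missing"  # low quality measurements are treated as missing
    if ratio > high_cut:
        return "amplified"
    if ratio < low_cut:
        return "deleted"
    return "unchanged"
```

With the defaults, `classify(1.5)` gives `"amplified"` and `classify(0.7)` gives `"deleted"`; the defaults simply restate the 1.43 and 0.73 thresholds reported in the text.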

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

\[ w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}} \]

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random
permutations of the label vector. The probability that a gene had a larger or
equal weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
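
A minimal sketch of the signal-to-noise weight and its permutation test, as described above, is given below. It is not the authors' code: the function names are hypothetical, the sample SD is assumed (the text does not say whether sample or population SD was used), and a fixed random seed is used only to make the example repeatable.

```python
import random
from statistics import mean, stdev


def signal_to_noise(expr, labels):
    """Weight w = (m1 - m0) / (s1 + s0) for one gene.

    expr   : expression values, one per cell line
    labels : 1 for amplified, 0 for nonamplified, same order as expr
    Returns None when an SD cannot be computed or the denominator is zero.
    """
    amp = [x for x, lab in zip(expr, labels) if lab == 1]
    non = [x for x, lab in zip(expr, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        return None
    denom = stdev(amp) + stdev(non)  # sample SD is an assumption
    if denom == 0:
        return None
    return (mean(amp) - mean(non)) / denom


def permutation_alpha(expr, labels, n_perm=10000, seed=0):
    """Fraction of label permutations whose weight is >= the observed weight."""
    observed = signal_to_noise(expr, labels)
    if observed is None:
        return None
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        w = signal_to_noise(expr, shuffled)
        if w is not None and w >= observed:
            hits += 1
    return hits / n_perm
```

A gene would then be flagged when `permutation_alpha(...)` falls below 0.05, mirroring the threshold stated in the text.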

Genomic Localization of cDNA Clones and Amplicon Mapping. Each
cDNA clone on the microarray was assigned to a UniGene cluster using the
UniGene Build 141 (footnote 6). A database of genomic sequence alignment
information for mRNA sequences was created from the August 2001 freeze of the
University of California Santa Cruz's GoldenPath database (footnote 7). The
chromosome and bp positions for each cDNA clone were then retrieved by relating
these data sets. Amplicons were defined as a CGH copy number ratio >2.0 in at
least two adjacent clones in two or more cell lines or a CGH ratio >2.0 in at
least three adjacent clones in a single cell line. The amplicon start and end
positions were



6 Internet address: http://research.nhgri.nih.gov/micro...
7 Internet address: www.genome.ucsc.edu.
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Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location            Start (Mb)   End (Mb)   Size (Mb)
1p13                  132.79      132.94       0.2
1q21                  173.92      177.25       3.3
1q22                  179.28      179.57       0.3
3p14                   71.94       74.66       2.7
7p12.1-7p11.2          55.62       60.95       5.3
7q31                  125.73      130.96       5.2
7q32                  140.01      140.68       0.7
8q21.11-8q21.13        86.45       92.46       6.0
8q21.3                 98.45      103.05       4.6
8q23.3-8q24.14        129.88      142.15      12.3
8q24.22               151.21      152.16       1.0
9p13                   38.65       39.25       0.6
13q22-q31              77.15       81.38       4.2
16q22                  86.70       87.62       0.9
17q11                  29.30       30.85       1.6
17q12-q21.2            39.79       42.80       3.0
17q21.32-q21.33        52.47       55.80       3.3
17q22-q23.3            63.81       69.70       5.9
17q23.3-q24.3          69.93       74.99       5.1
19q13                  40.63       41.40       0.8
20q11.22               34.59       35.85       1.3
20q13.12               44.00       45.62       1.6
20q13.12-q13.13        46.45       49.43       3.0
20q13.2-q13.32         51.32       59.12       7.8


CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In
addition, novel amplicons were identified at 9p13 (38.65-39.25 Mb)
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct correlation
of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly overexpressed.
A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly overexpressed
genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel



extended to include neighboring nonamplified clones (ratio, <1.5). The am- 
plicon size determination was partially dependent on local clone density. 
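
The amplicon definition given in this section (ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes the "two or more cell lines" must share the same pair of adjacent clones, it treats missing values (`None`) as not amplified, and it omits the extension to neighboring nonamplified clones. The name `call_amplicons` is hypothetical.

```python
def call_amplicons(ratios_by_line, threshold=2.0):
    """Return inclusive (start_index, end_index) clone intervals called as amplicons.

    ratios_by_line: dict mapping cell line name -> list of CGH ratios,
    all lists covering the same clones in genomic order.
    """
    rows = list(ratios_by_line.values())
    n = len(rows[0]) if rows else 0
    amp = [[r is not None and r > threshold for r in row] for row in rows]

    flagged = [False] * n
    # Rule 1: the same two adjacent clones amplified in >= 2 cell lines.
    for i in range(n - 1):
        support = sum(1 for row in amp if row[i] and row[i + 1])
        if support >= 2:
            flagged[i] = flagged[i + 1] = True
    # Rule 2: three adjacent clones amplified in any single cell line.
    for i in range(n - 2):
        if any(row[i] and row[i + 1] and row[i + 2] for row in amp):
            flagged[i] = flagged[i + 1] = flagged[i + 2] = True

    # Merge flagged clones into contiguous intervals.
    intervals, start = [], None
    for i, is_flagged in enumerate(flagged):
        if is_flagged and start is None:
            start = i
        elif not is_flagged and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, n - 1))
    return intervals
```

Converting the returned clone indices to Mb positions would require the per-clone map positions, which are not reproduced here.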

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361K8 was
labeled with SpectrumOrange (Vysis, Downers Grove, IL), and
SpectrumOrange-labeled probe for EGFR was obtained from Vysis.
SpectrumGreen-labeled chromosome 7 and 17 centromere probes (Vysis) were used
as a reference. A tissue microarray containing 612 formalin-fixed,
paraffin-embedded primary breast cancers (17) was applied in FISH analyses as
described (18). The use of these specimens was approved by the Ethics Committee
of the University of Basel and by the NIH. Specimens containing a 2-fold or
higher increase in the number of test probe signals, as compared with
corresponding centromere signals, in at least 10% of the tumor cells were
considered to be amplified. Survival analysis was performed using the
Kaplan-Meier method and the log-rank test.
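
The FISH scoring rule stated above (a 2-fold or higher test-to-centromere signal ratio in at least 10% of tumor cells) can be expressed as a short check. This is a sketch under stated assumptions: the per-cell data layout and the function name `is_amplified` are hypothetical, and cells with no centromere signal are simply skipped.

```python
def is_amplified(cells, fold=2.0, min_fraction=0.10):
    """Score one specimen from per-cell signal counts.

    cells: list of (test_probe_signals, centromere_signals) tuples.
    Returns True/False, or None if no cell can be scored.
    """
    scored = [(t, c) for t, c in cells if c > 0]
    if not scored:
        return None
    positive = sum(1 for t, c in scored if t >= fold * c)
    return positive / len(scored) >= min_fraction
```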

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-AGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGGTAGCGATTGTAG-3'.

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824
arrayed cDNA clones were applied for analysis of gene expression
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although
less dramatic, outcomes on gene expression (Fig. 1).

Identification of Distinct Breast Cancer Amplicons. Base-pair
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2,
Supplement Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal
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Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and
chromosome 17 centromere (green) to BT-474 cells (B).
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Fig. 4. List of 50 genes with a statistically
significant correlation (α value <0.05) between
gene copy number and gene expression. Name,
chromosomal location, and the α value for each
gene are indicated. The genes have been ordered
according to their position in the genome. The color
maps on the right illustrate the copy number and
expression ratio patterns in the 14 cell lines. The
key to the color code is shown at the bottom of the
graph. Gray squares, missing values. The complete
list of 270 genes is shown in supplemental Fig. B.
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amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated
with poor prognosis of the patients (P ~ 0.001).

Statistical Identification and Characterization of 270 Highly
Expressed Genes in Amplicons. Statistical comparison of expression
levels of all genes as a function of gene amplification identified
270 genes whose expression was significantly influenced by copy
number across all 14 cell lines (Fig. 4, Supplemental Fig. B).
According to the gene ontology data (footnote 8), 91 of the 270 genes
represented hypothetical proteins or genes with no functional annotation,
whereas 179 had associated functional information available. Of these, 151
(84%) are implicated in apoptosis, cell proliferation, signal
transduction, and transcription, whereas 28 (16%) had functional annotations
that could not be directly linked with cancer.



DISCUSSION 

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH (footnote 9) (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and 
molecular genetic studies. The effects of these somatic genetic 
changes on gene expression levels have remained largely unknown, 
although a few studies have explored gene expression changes occur- 
ring in specific amplicons (15, 19-21). Here, we applied genome- 
wide cDNA microarrays to identify transcripts whose expression 
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was
substantial with the most dramatic effects seen in the case of high-



8 Internet address: http://www.geneontology.org/.



9 Internet address: http://www.ncbi.nlm.nih.gov/entrez.
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level copy number increase. Low-level copy number gains and losses 
also had a significant influence on expression levels of genes in the 
regions affected, but these effects were more subtle on a gene-by-gene 
basis than those of high-level amplifications. However, the impact of 
low-level gains on the dysregulation of gene expression patterns in 
cancer may be equally important if not more important than that of 
high-level amplifications. Aneuploidy and low-level gains and losses 
of chromosomal arms represent the most common types of genetic 
alterations in breast and other cancers and, therefore, have an influ- 
ence on many genes. Our results in breast cancer extend the recent 
studies on the impact of aneuploidy on global gene expression pat- 
terns in yeast cells, acute myeloid leukemia, and a prostate cancer 
model system (22-24). 

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many
amplicons detected previously by chromosomal CGH (9, 10, 25, 26) and
also discovered novel amplicons that had not been detected previ- 
ously, presumably because of their small size (only 1-2 Mb) or close 
proximity to other larger amplicons. One of these novel amplicons 
involved the homeobox gene region at 17q21.3 and led to the over- 
expression of the HOXB7 and HOXB2 genes. The homeodomain 
transcription factors are known to be key regulators of embryonic 
development and have been occasionally reported to undergo aberrant 
expression in cancer (27, 28). HOXB7 transfection induced cell
proliferation in melanoma, breast, and ovarian cancer cells and increased
tumorigenicity and angiogenesis in breast cancer (29-32). The present
results imply that gene amplification may be a prominent
mechanism for overexpressing HOXB7 in breast cancer and suggest that
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes 
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in 
breast cancer development and suggest novel pathogenetic mech- 
anisms. Although we would not expect all of them to be causally 
involved, it is intriguing that 84% of the genes with associated 
functional information were implicated in apoptosis, cell prolifer- 
ation, signal transduction, transcription, or other cellular processes 
that could directly imply a possible role in cancer progression. 
Therefore, a detailed characterization of these genes may provide 
biological insights to breast cancer progression and might lead to 
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays 
to the analysis of both copy number and expression levels of over 
12,000 transcripts throughout the breast cancer genome, roughly 
once every 267 kb. This analysis provided: (a) evidence of a 
prominent global influence of copy number changes on gene 
expression levels; (b) a high-resolution map of 24 independent 
amplicons in breast cancer, and (c) identification of a set of 270 
genes, the overexpression of which was statistically attributable to 
gene amplification. Characterization of a novel amplicon at 
17q21.3 implicated amplification and overexpression of the 
HOXB7 gene in breast cancer, including a clinical association 



between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by
gene amplification provides a powerful approach to highlight
genes with an important role in cancer as well as to prioritize and
validate putative targets for therapy development.
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Genomic DNA copy number alterations are key genetic events in 
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid- 
ization (array CGH) analysis of DNA copy number variation in 
a series of primary human breast tumors. We have profiled DNA 
copy number alteration across 6,691 mapped human genes, in 44 
predominantly advanced, primary breast tumors and 10 breast 
cancer cell lines. While the overall patterns of DNA amplification 
and deletion corroborate previous cytogenetic studies, the high- 
resolution (gene-by-gene) mapping of amplicon boundaries and 
the quantitative analysis of amplicon shape provide significant 
improvement in the localization of candidate oncogenes. Parallel 
microarray measurements of mRNA levels reveal the remarkable 
degree to which variation in gene copy number contributes to 
variation in gene expression in tumor cells. Specifically, we find 
that 62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene ex- 
pression across a wide range of DNA copy number alterations 
(deletion, low-, mid- and high-level amplification), that on average, 
a 2-fold change in DNA copy number is associated with a
corresponding 1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast
tumors is directly attributable to underlying variation in gene copy 
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration 
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved 
regions. Because we had measured mRNA levels in parallel in 
the same samples (8), using the same DNA microarrays, we had 
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From 
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this analysis, we have identified a significant impact of wide- 
spread DNA copy number alteration on the transcriptional 
programs of breast tumors. 

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and 
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691 
different mapped human genes (i.e., UniGene clusters). The 
"reference" (labeled with Cy3) for each hybridization was nor- 
mal female leukocyte DNA from a single donor. The fabrication 
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using ScanAlyze
software (available at http://rana.lbl.gov). Fluorescence ratios
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating 
Significance of Altered Fluorescence Ratios in the supporting 
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors). 
Map positions for arrayed human cDNAs were assigned by 
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Fig. 1. Genome-wide measurement of DNA copy number alteration by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each column represents
one of 6,691 different mapped human genes present on the microarray, ordered by genome map position from 1pter through Xqter. Moving average (symmetric
5-nearest neighbors) fluorescence ratios (test/reference) are depicted using a log2-based pseudocolor scale (indicated), such that red luminescence reflects
fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.



identifying the starting position of the best and longest match of 
any DNA sequence represented in the corresponding UniGene 
cluster (10) against the "Golden Path" genome assembly 
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene 
clusters represented by multiple arrayed elements, mean
fluorescence ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence
ratios are "mean-centered" (i.e., reported relative to the mean 
ratio across the 44 tumor samples). The data set described here 
can be accessed in its entirety in the supporting information. 
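
The per-array normalization and the moving-average display described in this section can be sketched as below. This is an illustration, not the authors' pipeline: the function names are hypothetical, base-2 logarithms are assumed because the figures use a log2 scale, and the reading of "symmetric 5-nearest neighbors" (how many neighbors on each side) is left to the caller through the parameter `k`. Windows are truncated at the ends of the profile, which is also an assumption.

```python
import math


def normalize_log_ratios(ratios):
    """Log-transform ratios and shift them so the array-wide mean log ratio is 0."""
    logs = [math.log2(r) for r in ratios]
    centre = sum(logs) / len(logs)
    return [x - centre for x in logs]


def moving_average(values, k):
    """Symmetric moving average using k neighbors on each side of every element."""
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - k)          # truncate the window at the left edge
        hi = min(n, i + k + 1)      # truncate the window at the right edge
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

For instance, `normalize_log_ratios([1.0, 2.0, 4.0])` returns `[-1.0, 0.0, 1.0]`, and the values would be ordered by genome map position before `moving_average` is applied.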

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the im- 
proved spatial resolution of array CGH, we ordered (fluores- 
cence ratios for) the 6,691 cDNAs according to the "Golden 
Path" (http://genome.ucsc.edu/) genome assembly of the draft 
human genome sequences (11). In so doing, arrayed cDNAs not 
only themselves represent genes of potential interest (e.g., 
candidate oncogenes within amplicons), but also provide precise 
genetic landmarks for chromosomal regions of amplification and 



deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-copy
loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which 
were slightly underestimated, in agreement with previous ob- 
servations (7). Numerous DNA copy number alterations were 
evident in both the breast cancer cell lines and primary tumors 
(Fig. la), detected in the tumors despite the presence of euploid 
non-tumor cell types; the magnitudes of the observed changes 
were generally lower in the tumor samples. DNA copy-number 
alterations were found in every cancer cell line and tumor, and 
on every human chromosome in at least one sample. Recurrent 
regions of DNA copy number gain and loss were readily identifiable.
For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing 
of gains/losses is provided in Tables 2 and 3, which are published 
as supporting information on the PNAS web site). The total 
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Fig. 2. DNA copy number alteration across chromosome 8 by array CGH. (a) DNA copy number profiles are illustrated for cell lines containing different numbers
of X chromosomes, for breast cancer cell lines, and for breast tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering to
highlight recurrent copy number changes. The 241 genes present on the microarrays and mapping to chromosome 8 are ordered by position along the
chromosome. Fluorescence ratios (test/reference) are depicted by a log2 pseudocolor scale (indicated). Selected genes are indicated with color-coded text (red,
increased; green, decreased; black, no change; gray, not well measured) to reflect correspondingly altered mRNA levels (observed in the majority of the subset
of samples displaying the DNA copy number change). The map positions for genes of interest that are not represented on the microarray are indicated in the
row above those genes represented on the array. (b) Graphical display of DNA copy number profile for breast cancer cell line SKBR3. Fluorescence ratios
(tumor/normal) are plotted on a log2 scale for chromosome 8 genes, ordered along the chromosome.



number of genomic alterations (gains and losses) was found to 
be significantly higher in breast tumors that were high grade (P = 
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting
information on the PNAS web site).

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon 
structure is most easily appreciated in the breast cancer cell line 
SKBR3. Although a conventional CGH analysis of 8q in SKBR3 
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification
(labeled 1-3 in Fig. 2b). For each of these regions we can define the



boundaries of the interval recurrently amplified in the tumors we 
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the 
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations 
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Fig. 3. Concordance between DNA copy number and gene expression across chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors. Breast cancer cell lines and tumors are separately ordered by hierarchical clustering (Upper), and the
identical sample order is maintained (Lower). The 354 genes present on the microarrays and mapping to chromosome 17, and for which both DNA copy number
and mRNA levels were determined, are ordered by position along the chromosome; selected genes are indicated in color-coded text (see Fig. 2 legend).
Fluorescence ratios (test/reference) are depicted by separate log2 pseudocolor scales (indicated).



of DNA copy number and mRNA levels for genes on chromosome
17 (Fig. 3). The overall patterns of gene amplification and
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level amplification
and increased gene expression is not restricted to chromosome
17. Genome-wide, of 117 high-level DNA amplifications
(fluorescence ratios >4, and representing 91 different
genes), 62% (representing 54 different genes; see Table 5, which
is published as supporting information on the PNAS web site)
are found associated with at least moderately elevated mRNA
levels (mean-centered fluorescence ratios >2), and 42%
(representing 36 different genes) are found associated with
comparably highly elevated mRNA levels (mean-centered fluorescence
ratios >4).
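
The concordance tally described in the preceding paragraph amounts to counting, among high-level amplifications (DNA ratio >4), how many show mean-centered mRNA ratios >2 and >4. A minimal sketch follows; the input layout and the name `concordance` are hypothetical, and measurements missing either value (`None`) are skipped, which is an assumption rather than something the text specifies.

```python
def concordance(pairs, dna_cut=4.0, moderate_cut=2.0, high_cut=4.0):
    """Summarize mRNA elevation among high-level DNA amplifications.

    pairs: iterable of (dna_ratio, mrna_ratio), one per gene/sample measurement.
    Returns None when no measurement passes the DNA cutoff.
    """
    amplified = [m for d, m in pairs
                 if d is not None and m is not None and d > dna_cut]
    if not amplified:
        return None
    n = len(amplified)
    moderate = sum(1 for m in amplified if m > moderate_cut)
    high = sum(1 for m in amplified if m > high_cut)
    return {
        "n_amplified": n,
        "frac_moderate": moderate / n,  # at least moderately elevated mRNA
        "frac_high": high / n,          # highly elevated mRNA
    }
```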

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the 



breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 × 10^-49, 1 × 10^-49,
5 × 10^-5, 1 × 10^-2; tumors, 1 × 10^-43, 1 × 10^-214, 5 × 10^-41,
1 × 10^-4). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level)
demonstrated that on average, a 2-fold change in DNA copy
number was accompanied by 1.4- and 1.5-fold changes in mRNA
level for the breast cancer cell lines and tumors, respectively (Fig.
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve, 
but with the peak markedly shifted in the positive direction from 
zero. This shift is statistically significant, as evidenced in a plot 
of observed vs. expected correlations (Fig. 4c), and reflects a 
pervasive global influence of DNA copy number alterations on 
gene expression. Notably, the highest correlations between DNA 
copy number and mRNA level (the right tail of the distribution 
in Fig. 4b) comprise both amplified and deleted genes (data not 
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37

Fig. 4. Genome-wide influence of DNA copy number alterations on mRNA levels. (a) For breast cancer cell lines (gray) and tumor samples (black), both
mean-centered mRNA fluorescence ratio (log2 scale) quartiles (box plots indicate 25th, 50th, and 75th percentile) and averages (diamonds; y-value error bars
indicate standard errors of the mean) are plotted for each of five classes of genes, representing DNA deletion (tumor/normal ratio <0.8), no change (0.8-1.2),
low- (1.2-2), medium- (2-4), and high-level (>4) amplification. P values for pair-wise Student's t tests, comparing averages between adjacent classes (moving
left to right), are 4 x 10^-49, 1 x 10^-49, 5 x 10^-5, 1 x 10^-2 (cell lines), and 1 x 10^-43, 1 x 10^-214, 5 x 10^-41, 1 x 10^-4 (tumors). (b) Distribution of correlations between
DNA copy number and mRNA levels, for 6,095 different human genes across 37 breast tumor samples. (c) Plot of observed versus expected correlation coefficients.
The expected values were obtained by randomization of the sample labels in the DNA copy number data set. The line of unity is indicated. (d) Percent variance
in gene expression (among tumors) directly explained by variation in gene copy number. Percent variance explained (black line) and fraction of data retained
(gray line) are plotted for different fluorescence intensity/background (a rough surrogate for signal/noise) cutoff values. Fraction of data retained is relative
to the 1.2 intensity/background cutoff. Details of the linear regression model used to estimate the fraction of variation in gene expression attributable to
underlying DNA copy number alteration can be found in the supporting information (see Estimating the Fraction of Variation in Gene Expression Attributable
to Underlying DNA Copy Number Alteration).



tumors that could be attributed to underlying variation in DNA 
copy number. From this analysis, we estimate that, overall, about 
1% of all of the observed variation in mRNA levels can be 
explained directly by variation in copy number of the altered 
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction 
of the data most reliably measured (fluorescence intensity/ 
background >3); using that data, our estimate of the percent 
variation in mRNA levels directly attributed to variation in gene 
copy number increases to 12% (Fig. 4d). This still undoubtedly 
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in 
the expression programs of the tumor cells themselves, but also
by the variable presence of non-tumor cell types within clinical 
samples. 
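
The fold-change figures quoted above follow from the slope of the log-log regression: if log(mRNA level) rises by a slope b per unit of log(DNA copy number), then a 2-fold change in copy number corresponds to a 2^b-fold change in mRNA level, whatever the base of the logarithm. The sketch below illustrates only that arithmetic. The function names and the five class averages are hypothetical, chosen so that the slope is 0.5; they are not data from the study, and the authors' own regression code is not given in the text.

```python
def fit_slope(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fold_change_per_doubling(slope):
    """mRNA fold change that accompanies a 2-fold change in DNA copy number."""
    return 2.0 ** slope


# Hypothetical class averages on a log2 scale (deletion, no change,
# low-, medium-, and high-level amplification). Illustrative values only.
log2_copy_number = [-0.5, 0.0, 0.7, 1.5, 2.5]
log2_mrna_level = [0.5 * x for x in log2_copy_number]

slope, intercept = fit_slope(log2_copy_number, log2_mrna_level)
print(round(fold_change_per_doubling(slope), 2))  # prints 1.41
```

Read the other way, the reported 1.4- and 1.5-fold changes per doubling correspond to slopes of about 0.49 and 0.58, respectively.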

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct 
effect on global gene expression patterns in both breast cancer 



cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable 
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic 
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from 
our finding that 62% of highly amplified genes in breast cancer 
exhibit at least 2-fold increased expression. These contrasting 
findings may reflect methodological differences between the
studies. For example, the study of Platzer et al. (15) may have
systematically under-measured gene expression changes. In this 
regard it is remarkable that only 14 transcripts of many thousand 
residing within unamplified chromosomal regions were found to 
exhibit at least 4-fold altered expression in metastatic colon 
cancer. Additionally, their reliance on lower-resolution chromo- 
somal CGH may have resulted in poorly delimiting the bound- 
aries of high-complexity amplicons, effectively overcalling re- 
gions with amplification. Alternatively, the contrasting findings 
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage 
compensation. Third, this finding cautions that elevated expres- 
sion of an amplified gene cannot alone be considered strong 
independent evidence of a candidate oncogene's role in tumor- 
igenesis. In our study, fully 62% of highly amplified genes 
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of amplicon
boundaries and shape [to identify the "driving" gene(s)
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing

WISP genes are members of the connective tissue growth factor
family that are up-regulated in Wnt-1-transformed cells and
aberrantly expressed in human colon tumors

ABSTRACT Wnt family members are critical to many
developmental processes, and components of the Wnt signaling
pathway have been linked to tumorigenesis in familial and
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-
20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.



Wnt-1 is a member of an expanding family of cysteine-rich,
glycosylated signaling proteins that mediate diverse developmental
processes such as the control of cell proliferation,
adhesion, cell polarity, and the establishment of cell fates (1,
2). Wnt-1 originally was identified as an oncogene activated by
the insertion of mouse mammary tumor virus in virus-induced
mammary adenocarcinomas (3, 4). Although Wnt-1 is not
expressed in the normal mammary gland, expression of Wnt-1
in transgenic mice causes mammary tumors (5).

In mammalian cells, Wnt family members initiate signaling
by binding to the seven-transmembrane spanning Frizzled
receptors and recruiting the cytoplasmic protein Dishevelled
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the
kinase activity of the normally constitutively active glycogen
synthase kinase-3β (GSK-3β), resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the
transcription factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 6). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations
contribute to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked "advertisement" in
accordance with 18 U.S.C. §1734 solely to indicate this fact.
© 1998 by The National Academy of Sciences 0027-8424/98/9514717-6$2.00/0
PNAS is available online at www.pnas.org.

Although much has been learned about the Wnt signaling
pathway over the past several years, only a few of the
transcriptionally activated downstream components activated by
Wnt have been characterized. Those that have been described
cannot account for all of the diverse functions attributed to
Wnt signaling. Among the candidate Wnt target genes are
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the
Wnt signaling pathway (10).

To identify additional downstream genes in the Wnt signaling
pathway that are relevant to the transformed cell phenotype,
we used a PCR-based cDNA subtraction strategy,
suppression subtractive hybridization (SSH) (11), using RNA
isolated from C57MG mouse mammary epithelial cells and
C57MG cells stably transformed by a Wnt-1 retrovirus.
Overexpression of Wnt-1 in this cell line is sufficient to induce a
partially transformed phenotype, characterized by elongated
and refractile cells that lose contact inhibition and form a
multilayered array (12, 13). We reasoned that genes
differentially expressed between these two cell lines might contribute
to the transformed phenotype.

In this paper we describe the cloning and characterization
of two genes up-regulated in Wnt-1 transformed cells, WISP-1
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which
includes connective tissue growth factor (CTGF), Cyr61, and
nov, a family not previously linked to Wnt signaling.

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 μg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 μg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
To whom reprint requests should be addressed. e-mail: diane@gene.com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 7i i-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones
encoding full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides I4w-i5ii. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 μM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles.
WISP and glyceraldehyde-3-phosphate dehydrogenase primer
sequences are available on request.

In Sir* Hybridisation. *>P-labelcd sense and antisensc ribo- 
p robes were transcribed from an K97.bp eO< produce corre- 
sponding us nucleotides oOl-l<UO at mouse »T.TM or a 
294-bp PCR product corresponding to nucleotides 62-.V71 of 
mouse WtSP-2. All tissues were processed m described 

Radiation Hybrid Mapping. Genomic DNA from each
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid
Panels (Research Genetics, Huntsville, AL) and human and
hamster control DNAs were PCR-amplified, and the results
were submitted to the Stanford or Massachusetts Institute of
Technology web servers.

Cell Lines, Tumors, and Mucosa Specimens. Tissue specimens
were obtained from the Department of Pathology (University
of Pittsburgh) for patients undergoing colon resection
and from the University of Leeds, United Kingdom. Genomic
DNA was isolated (Qiagen) from the pooled blood of 10
normal human donors, surgical specimens, and the following
ATCC human cell lines: SW480, COLO 320DM, HT-29,
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph
node metastasis, colon adenocarcinoma), HCT 116 (colon
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and
HM7 (a variant of ATCC colon adenocarcinoma cell line LS
174T). DNA concentration was determined by using Hoechst
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation
over CsCl cushions or prepared by using RNAzol.

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate
dehydrogenase housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify
Wnt-1-inducible genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially
expressed cDNAs (1,33^ total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homologues,
32% matched expressed sequence tags, and 29% had
no match. To confirm that the transcript was differentially
expressed, semiquantitative reverse transcription-PCR and
Northern analysis were performed by using mRNA from the
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but
not in the parent C57MG cells or C57MG cells overexpressing
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell
line and WISP-2 by approximately 1-fold by both Northern
analysis and reverse transcription-PCR.

An independent, but similar, system was used to examine
WISP expression after Wnt-1 induction. C57MG cells expressing
the Wnt-1 gene under the control of a tetracycline-repressible
promoter produce low amounts of Wnt-1 in the
repressed state but show a strong induction of Wnt-1 mRNA
and protein within 24 hr after tetracycline removal (8). The
levels of Wnt-1 and WISP RNA isolated from these cells at
various times after tetracycline removal were assessed by
quantitative PCR. Strong induction of Wnt-1 mRNA was seen
as early as 10 hr after tetracycline removal. Induction of WISP
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown).
These data support our previous observations that show that
WISP induction is correlated with Wnt-1 expression. Because
the induction is slow, occurring after approximately 48 hr, the
induction of WISPs may be an indirect response to Wnt-1
signaling.

cDNA clones of human WISP-1 were isolated and the
sequence compared with mouse WISP-1. The cDNA sequences
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ~40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).
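
Percent identity figures such as the 84% quoted above are obtained by counting matching positions in a pairwise alignment. A minimal sketch follows, assuming the two sequences are already aligned to equal length with "-" marking gaps, and that columns containing a gap are excluded from the denominator. Conventions differ between alignment programs, and the text does not say which one was used here. The function name and the toy sequences are hypothetical and are not WISP sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (illustrative only): 6 ungapped columns, 5 matches.
print(round(percent_identity("MRWF-LPW", "MRWLAL-W"), 1))  # prints 83.3
```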

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative
molecular masses of ~27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at




Fie. 1. wjiF-i aari WJ*-2areindue.;dUYWnfl,eutnoiWnt.4, 
expression Ifl eslli. Ho:thcr» walysi; of W »^ 

mSF'Z (B1 cspteSJiea in C57MC, CSTMCS/Wnkl. and (57M<./ 

cdL. Poly(A) + RNa (2 *»» tuhjcctatf to Norhern Did 
cn 0 )y 3 « and hybndiaad with a 70-op mouhd W-/-speafi« U'oUc 
(ammo acidi 37S JUU) or S 1<HU>D W1SF-2-WC*fo H«bc (aucleotulot 
U2g-tfi27) in the Y untraftilated tcglon. Blow were rdiybrtdu»d w-th 
human fi-tctin prebc. 
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W1SP-1 
WlSM 



W1SP-3 



-r~^ir;::^?HfD^^ 

TlC . r Encoded amino add f^^/,^ 
hMman HKW W j Mid «0USC and human IM8M [Byw » 

vwcllvroiabuwvindlo (TSP), »nd C-iermmal (CD aoraims ire 
underlined. 

position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology
(36-44%) was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that
associate with the cell surface and extracellular matrix.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence
(GCGCCXXC) conserved in most insulin-like growth factor (IGF)-
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Fta. ^ M> Encuded imino »cid d»qu«i« Alignment of human 
m 'TtemanTi-Uu* of WISM ami WKi».2 nu are noi 

«si*wiNirti«l Ki«).Th*t<wrca«»ine rrvulua *;^^dom».n 
Art«? ita—t in WISP-l arc mdicaud wi* > (Q 
W£PoIlNA in Human itssua- TCR «•»» p-temod oa bum«o 

binding proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von
Willebrand factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP1) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromojomc Localiwrlon of Ihe WISP Gene*. The chro- 
mosomal location of the human WISP genca wu determined 
by radiation hybrid mapping panclt, k epprcocimatcly 

K&x cR from toe meiotic marker aFM259xc5 (logarithm at 
onns (loci) score i M i ] on chromosome 8q24.i to Sq24.3. in die 
same region as toe human iocuj of the novH Samily member 
(27) and roughly 4 MbJ disut to c~myc [ JA). Preliminary fine 
mapping indicates that WISP-I is Ineatad aear D8S1712 STS. 
. * linked to the markrr SHGC-3.M22 (lod • 1,000) oo 

chromocome : i)qlXJ0ql3.1. Human WISPS mapped tc dwo- 
mojome 6q2a-6q23 and ie linked to tho marlcer AFM21l2e5 
Qod - 1,000), WISP-2 ie approximately 13 Mb« proximal to 
CTGF and 2J Mbe proximal to the human oellular oncogene 
MY0 {27, 29). 

Amplification and Aberrant Expression of WISPs in Human
Colon Tumors. Amplification of protooncogenes is seen in
many human tumors and has etiological and prognostic
significance. For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold)
amplification, with the HT-29 and WiDr cell lines demonstrating
an 8-fold increase. Significantly, the pattern of amplification
observed did not correlate with that observed for c-myc,
indicating that the c-myc gene is not part of the amplicon that
involves the WISP-1 locus.

We next examined whether the WISP genes were amplified
in a panel of 23 primary human colon adenocarcinomas. The
relative WISP gene copy number in each colon tumor DNA
was compared with pooled normal DNA from 10 donors by
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).
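
The relative copy numbers above rest on the 2^(ΔCt) formula given in Materials and Methods. The sketch below shows that calculation, assuming ΔCt is the reference threshold cycle minus the sample threshold cycle, and that normalization to the glyceraldehyde-3-phosphate dehydrogenase signal is done by dividing the two 2^(ΔCt) ratios. The text does not spell out the normalization arithmetic, so that step, the function names, and the cycle values are illustrative assumptions rather than the authors' procedure.

```python
def relative_level(ct_reference, ct_sample):
    """2^(delta Ct): a sample detected one cycle earlier has 2-fold more template."""
    delta_ct = ct_reference - ct_sample
    return 2.0 ** delta_ct


def normalized_relative_level(target_ref, target_sample,
                              housekeeping_ref, housekeeping_sample):
    """Target ratio divided by the housekeeping-gene ratio (assumed scheme)."""
    target_ratio = relative_level(target_ref, target_sample)
    housekeeping_ratio = relative_level(housekeeping_ref, housekeeping_sample)
    return target_ratio / housekeeping_ratio


# Hypothetical threshold cycles: the target is detected two cycles earlier
# in tumor DNA than in pooled normal DNA; the housekeeping gene is unchanged.
print(normalized_relative_level(26.0, 24.0, 20.0, 20.0))  # prints 4.0
```

Under these assumptions a value near 1 means no change relative to the reference, and a value of 2 means a 2-fold gain.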

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
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FiO. 5. Amelificoticr of WISP J g»nomicDMA in colon cinoerccll 
It nil (A) AmpHficfttion tn cell line aNA u*s determined by QuanU* 
tatiwe PCR. (fl) Southern blots conisin(nj genomic DNA (in ul) 
O'gcster w»w KccRJ (HIV-/) or atjoI (c-*n>c) were hybritlircd >*iih 

3 100-bp hunwn WISP'I yrgbc (amino ac.dj 18C-2L9) or o humau 
t-myc probe (located ai 1903-2O00). "Ou MSP wid myc S#nM a/* 
datootod in nornwl hnman ganomie DJ^A atioi i longer film (tfpoture. 
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Tumor Number 

lumon. relative jcuc cop, ' ^^<^ V^^« 
adenocarcinoma -« "»J«> W q T^i^nN/Tnora lOhalih, 

Ml hy quantise PC* (FJJ. ^^SJ 
qwa erticnt ft tumor tisauc vtnea ™ wjB nj«w— ^ 

S ciminad compared with normal accent mucosa. 
S lTLorssho*ed greater than lCKo.dovae* P £».en. 
taSmiW. in 79% (15/19) of the lomo« 
"expression was significantly lower m the cum»rtb»cne 

S39S (12/111) or the colon njmnrs comparca with tno normal 
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Fie. 7. HOT RNA exorcsston in primor> ***** ^ 1^" 
Rbcve to expression in normal mucosa from the ^ patienl 

PCT V™ .4 of ft. tumor is fitted under .he 
do", in uipKcuie. lie experiment wis rcyeuied »■ l«»< i«"«>- 
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m «eou.The amount or overexprcssion rfWM «i«g.dfrom 
4- to >40-fold. 

DISCUSSION 

rw . OD ro»eh to understanding the molceula basis of eanotr 
One appro^" ,„ J£ e expression Between concer 
B i,° *? eSL ItratSi" bYsid or» assumptions Aat 
cells and nom ^ N ^' a.aer wiweco norms) and 

C57MT. mouse mammary epithelial cells UMiHormeg oy 
W ?bre* of the genes Mated. WKW, «W| wd 

cells infected with . Wnri retroviral 
««o, or S7MG cells expressine Wiit-1 under the control of 
rtebw™p4ihle promoter, and the second «u » 
W n ,-TS^ mice, wU breast 

^^^^cd ir^«y ^.< laducefl by polyoma 
StoD was ^detected mw™ J lll(Se ^5^^ 

indued by ^ ( ^|^^^ , tf. , S^ra 

or day* »fter Wnt-l trawfamatton. Thu.. I W5P ^""l 0 " 
coald rc t »H from W«.l Si E tialto£ directly * ro «S^^ 
WPScripdoo Uem repjlation or ^icnaurcly tough/ W« ^ 
^.ting tuming or t aawcrlptlOD taclo,. wh,ch u, tun, 

"The W^fdVfc.. on .oouional suWur-ily of the CCN Wt 
One striking tu-rcrence oten« I - tte 

wh.cto u pi e»enl in CTWh. v-f" ^ d 
TWi don».« u tioutf htio b. • nvah* P p U rdet-d«ired 

mey bind 

dittetent region of the molccolc th.i« the other CCN tjmiry 
S specinc receptor, have been idcntiCsd fejCTCF 
or oov. A recent report has shown th..l mte R rm '* 

wSn *o° fiiJase-..: W mor s^n.a in breaat tumor. , from 
W»t.l transeenic snimsli is consb^m »ilh previous oussr 
Z io.Ta^«eri F « for the related CKffv*™?* 
m.riii/ cKor«««d in the t brous siroois of miniinwy turners 
S*?EpXS?«-iU «e though, to control .he P«W«"'£ ' 
. liMu* stroma in tnirnmaiy tumorj by a eueade of 

S fae tor^U ttailer to thnt con-rnlHttg con«ect.ve 
8 - f„,««don during *ound renair. H IRS been proposed 
H^n^uSS ills or inflalnrMtDfy eett at rhc minor 
that mammery u>n\o« * . « h s , [mulU5 Jf|l - 

interstltlalinturfsce secret* lUh ^, rut. 

jrow'ft flctors that stimulate the production ot Cior- anu 

observed in Chd Siromal cell, tlmt jur-ounded the tumor cUlr 
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(eoithellal cells) In the Wnt-1 transonic mouse seeuonsof 
K^TtL finding <u e g«t, lh « P^T^Und 
Tould occur in which the stromal «»« ould ,^ p ^ T ^?^ d . 
W1SP-1 to regulate tumor ce« gtowtH on the WISP «i ecel- 
S ma£* Cornel cell-derived factors « «S^»" 
Matrix have boa populated to play a rate m tumor ecU 
Suon »»d proliL.ooa (34)- The location of HOT*-/ 
wd WtfM i« *• «»omal cell, of breast lumon supports Chit 

'TrS^MSM amplification «d "predion U, 

human coTon tumor, showed a Xl&lof & 

amplifieatloi. and overeapraasion, 

W RNA w M seen in the absence of DNA 

In contrvt. DNA woa amplified .n the color, i rumors 

but iu mRNA expression T ? CM ^n1n noma 
majority of ramorS compared with the ejrpreaaion in normal 
Ss m»«it Xfcm me same patient- fol hu "!?" 

WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus.
Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification
observed for WISP-2 may be caused by another gene in this
amplicon.

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell
transformation, suggesting it may be a negative regulator of
growth in cell lines (16). Although the mechanism by which
WISP-2 RNA expression is down-regulated during malignant
transformation is unknown, the reduced expression of WISP-2
in colon tumors and cell lines suggests that it may function as
a tumor suppressor. These results show that the WISP genes
are aberrantly expressed in colon cancer and suggest that their
altered expression may confer selective growth advantage to
the tumor.

Members of the Wnt signaling pathway have been impli-
cated in the pathogenesis of colon cancer, breast cancer, and
melanoma, including the tumor suppressor gene adenomatous
polyposis coli and β-catenin. Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to hu-
man carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1
transforms cells and induces tumorigenesis is unknown, the
identification of WISPs as genes that may be regulated down-
stream of Wnt-1 in C57MG cells suggests they could be
important mediators of Wnt-1 transformation. The amplifica-
tion and altered expression patterns of the WISPs in human
colon tumors may indicate an important role for these genes
in tumor development.

Wc ihank the DNA sxnthart ;re w «li«o^i;^^^ n I- 
Bater CM technical Miimee. r. Dowd tor r»»«non_ hgmd mapp-ns. 
K Willcrt and R. NW far th* M^op.ir.ble Ci ,MO/ Wm,l cells. V. 
Li. fc" dUcu«ion.. and D. Wood »d A. HruCt ror artwork 

1. Cadigan, K. M. & Nusse, R. (1997) Genes Dev. 11, 3286-3305.

3. Nusse, R. & Varmus, H. E. (1982) Cell 31, 99-109.

7. Molenaar, M., van de Wetering, M., Oosterwegel, M., Peterson-Maduro, J.,
Godsave, S., Korinek, V., Roose, J., Destrée, O. & Clevers, H. (1996) Cell 86, 391-399.
ABSTRACT The consistent cytogenetic translocation of
chronic myelogenous leukemia (the Philadelphia chromosome,
Ph1) has been observed in cells of multiple hematopoietic
lineages. This translocation creates a chimeric gene composed
of breakpoint-cluster-region (bcr) sequences from chromosome
22 fused to a portion of the abl oncogene on chromosome 9. The
resulting gene product (P210c-abl) resembles the transforming
protein of the Abelson murine leukemia virus in its structure
and tyrosine kinase activity. P210c-abl is expressed in Ph1-positive
cell lines of myeloid lineage and in clinical specimens
with myeloid predominance. We show here that Epstein-Barr
virus-transformed B-lymphocyte lines that retain Ph1 can
express P210c-abl. The level of expression in these B-cell lines is
generally lower and more variable than that observed for
myeloid lines. Protein expression is not related to amplification
of the abl gene but to variation in the level of bcr-abl mRNA
produced from a single Ph1 template.



Chronic myelogenous leukemia (CML) is a disease of the 
pluripotent stem cell (1). In greater than 95% of patients, the 
leukemic cells contain the cytogenetic marker known as the 
Philadelphia chromosome, or Ph 1 (2). This reciprocal 
translocation event between the long arms of chromosomes 
9 and 22 has been used as a disease-specific marker for
diagnosis and evaluation of therapy. Multiple hematopoietic 
lineages, including myeloid and B-lymphoid, contain Ph 1 in 
early or chronic phase, as well as in the more acute accel- 
erated and blast crisis phases of the disease. 

One molecular consequence of Ph 1 is the translocation of 
the chromosomal arm containing the c-abl gene on chromo- 
some 9 into the middle of the breakpoint-cluster region (bcr) 
gene on chromosome 22 (3-6). Although the precise 
translocation breakpoints are variable, an RNA-splicing
mechanism generates a very similar 8-kilobase (kb) mRNA in
each case (5-9). The hybrid bcr-abl message encodes a
structurally altered form of the abl oncogene product, called
P210c-abl (10-13), with an amino-terminal segment derived
from a portion of the exons of bcr on chromosome 22 and a
carboxyl-terminal segment derived from a major portion of
the exons of the c-abl gene on chromosome 9. The chimeric
structure of bcr-abl and the resulting P210c-abl is similar to the
structure of the Abelson murine leukemia virus gag-abl
genome and resulting P160v-abl transforming gene product.
Both proteins have very similar tyrosine kinase activities (10, 
11, 14) which can be distinguished by their relative stability 
to denaturing detergents and by their ATP requirements from 
the recently described tyrosine kinase activity of the c-abl 
gene product (15). 
In concert with structural modification of the amino-terminal
portion of the abl gene, increased level of expression
has been implicated in activation of c-abl oncogenic poten-
tial. Myeloid and erythroid cell lines and clinical samples
derived from acute-phase CML patients contain about 10-
fold higher levels of the 8-kb bcr-abl mRNA and P210c-abl than
the c-abl mRNA forms (6 and 7 kb) and P145c-abl gene product
(5, 8, 9, 11). The higher level of expression of the chimeric
bcr-abl message in acute-phase cells is not likely to be solely
due to the presence of the bcr promoter sequences at the 5'
end of the gene, since the normal 4.5-kb and 6.7-kb bcr-
encoded mRNA species are expressed at an even lower level
than the normal c-abl messages (5, 6).

We have analyzed a series of Epstein-Barr virus-immor- 
talized B-lymphoid cell lines derived from CML patients (16). 
With such in vitro clonal cell lines, we can evaluate whether 
the presence of Ph 1 always results in synthesis of the chimeric 
bcr-abl message and protein, and whether the quantitative 
expression varies for cells of B-lymphoid lineage as com- 
pared to previously examined myeloid cell lines. Our results 
show that cell lines that retain Ph1 do express bcr-abl message
and protein, but that the level is generally lower and more
variable than previously seen for myeloid cell lines. The
demonstration that the Ph1 chromosomal template can vary
in its level of expression of P210c-abl suggests that secondary
mechanisms, beyond the translocation itself, contribute to 
the regulation of the bcr-abl gene in different cell types or 
subclones that derive from the affected stem cell. 

MATERIALS AND METHODS 

Cells and Cell Labelings. Epstein-Barr virus-transformed 
B-lymphoid cell lines were established from peripheral blood 
samples of chronic- and acute-phase CML patients as reported
(16). The cell lines are designated according to patient
number, karyotype, and lineage. For example, SK- 
CML7Bt(9,22)-33 refers to CML patient 7, B-lymphoid cell 
line, 9;22 translocation (Ph 1 ), cell line 33; and SK-CML7BN- 
2 refers to B-cell line 2 with a normal karyotype derived from 
the same patient. Repeat karyotype analysis was performed 
to verify the retention of Ph 1 just prior to analysis for abl 
protein and RNA. Cells were maintained in RPMI 1640 
medium with 20% fetal bovine serum. We have not observed 
any consistent pattern of in vitro growth rate that correlates 
to the stage of disease at the time of transformation with 
Epstein-Barr virus. Cells (1.5 x 10^7) were washed twice with
Dulbecco's modified Eagle's medium lacking phosphate and 



Abbreviations: bcr, breakpoint-cluster region; CML, chronic 
myelogenous leukemia; kb, kilobase(s). 

^Present address: Department of Genetics, University of Washing- 
ton, Seattle, WA 98195. 
supplemented with 5% dialyzed fetal bovine serum. Cells
were then resuspended in 2 ml of the minimal medium.
Labeling was started with the addition of [32P]orthophosphate
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C
for 3-4 hr.
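
The cell-line naming convention described above (patient number, B-lymphoid lineage, karyotype, clone number) is regular enough to parse mechanically. The following is a minimal sketch, assuming names of the form SK-CML7Bt(9,22)-33, SK-CML7Bt-33, or SK-CML7BN-2; the `CellLine` type and `parse_cell_line` function are hypothetical helpers introduced only for illustration and are not part of the original methods.

```python
import re
from typing import NamedTuple, Optional


class CellLine(NamedTuple):
    patient: int      # CML patient number
    karyotype: str    # "Ph1" (9;22 translocation) or "normal"
    clone: int        # cell line number for that patient


# "B" marks the B-lymphoid lineage; "t" (optionally "t(9,22)") marks the
# translocation and "N" marks a normal karyotype.
_NAME = re.compile(
    r"^SK-CML(?P<patient>\d+)B(?P<kary>t(?:\(9[,;]22\))?|N)-(?P<clone>\d+)$"
)


def parse_cell_line(name: str) -> Optional[CellLine]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _NAME.match(name)
    if match is None:
        return None
    karyotype = "normal" if match.group("kary") == "N" else "Ph1"
    return CellLine(int(match.group("patient")), karyotype, int(match.group("clone")))
```

Names outside this scheme, such as the myeloid control line K562, are deliberately returned as `None` rather than guessed at.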

Immunoprecipitation and Immunoblotting. Immunoprecip-
itations were carried out as described (10). Cells (1.5 x 10^7)
were washed with phosphate-buffered saline and extracted
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1%
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/
100 mM NaCl) with 5 mM EDTA and 5 mM phenylmethyl-
sulfonyl fluoride. Extracts were clarified by centrifugation
and precipitated with normal or rabbit anti-abl sera (anti-
pEX-2 or anti-pEX-5) (17). The precipitated proteins were
electrophoresed in a NaDodSO4/8% polyacrylamide gel.
32P-labeled proteins were detected by autoradiography.
Alternatively, abl proteins were detected by immunoblotting.
Extracts from unlabeled cells were clarified, and proteins
were concentrated by immunoprecipitation with rabbit anti-
sera against abl-encoded proteins [anti-pEX-2 and anti-pEX-5
combined (17)] and then fractionated in 8% acrylamide gels.
The proteins were transferred from the gel to nitrocellulose 
filters, using protease-facilitated transfer (18). The abl- 
encoded proteins were detected using murine monoclonal 
antibodies as a probe and peroxidase-conjugated goat anti- 
mouse second stage antibody (Bio-Rad) for development. 
Rabbit antisera and mouse monoclonal antibodies to abl 
proteins were prepared using bacterially expressed regions of 
the v-abl protein as immunogens (17, 19). Anti-pEX-2 anti- 
bodies react with the internal tyrosine kinase domain and 
anti-pEX-5 antibodies react with the carboxyl-terminal seg- 
ment of the abl proteins. 

RNA Analysis. RNA was extracted from 10* cells by the 
NaDodSO4/urea/phenol method (20). Polyadenylylated
RNA was purified by oligo(dT) affinity chromatography.
Samples were electrophoresed in a 1% agarose/formalde-
hyde gel and transferred to nitrocellulose. abl RNA species
were detected by hybridization with a nick-translated v-abl
fragment probe (21).

DNA Analysis. DNA was prepared from 5 x 10^7 cells of
each cell line and processed for Southern blots with a v-abl 
probe as described (21). 

RESULTS 

Variable Levels of P210c-abl Are Detected in Ph1-Positive Cell
Lines. Ph1-positive and Ph1-negative, Epstein-Barr virus-
transformed B-lymphocyte cell lines derived from the same
patient were examined for P210c-abl synthesis by immuno-
precipitation of [32P]orthophosphate-labeled cell extracts
with anti-abl sera (Fig. 1). The normal c-abl protein P145c-abl
was detected at a similar level in multiple Ph1-positive and
Ph1-negative cell lines. P210c-abl was only detected in the
Ph1-positive cell lines because the bcr-abl chimeric gene
which encodes P210c-abl resides on the Ph1 (4, 5, 11, 13). The
level of P210c-abl was about 4- to 5-fold higher than the level
of P145c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The
Ph1-positive erythroid-progenitor cell line K562 (C) showed
a level of P210c-abl about 10-fold higher than P145c-abl.
However, the level of P210c-abl was about one-fifth that of
P145c-abl in the Ph1-positive SK-CML16Bt-1 cell line (Fig. 1B,
+). Comparison of different autoradiographic exposures
roughly indicated that the level of P210c-abl varies over a
20-fold range between these Ph1-positive B-cell lines. Anal-
ysis of four additional Ph1-positive B-cell lines demonstrated
that the level of P210c-abl fell into two general classes; some
cell lines had a level of P210c-abl similar to SK-CML7Bt-33
and others had the low level similar to SK-CML16Bt-1 (Table
1). This differs from previous studies with Ph1-positive
myeloid cell lines and patient samples derived from acute-
Fig. 1. Detection of variable levels of P210c-abl in Ph1-positive
B-cell lines. Production of P210c-abl and P145c-abl in Epstein-Barr
virus-transformed B-cell lines derived from a blast-crisis (A) and a
chronic-phase (B) CML patient was examined by metabolic labeling
with [32P]orthophosphate and immunoprecipitation. Ph1-negative
(-) and Ph1-positive (+) cell lines derived from each patient were
analyzed. The Ph1-negative cell line in A,- is SK-CML7BN-2 and in
B,- is SK-CML16BN-1. The Ph1-positive cell line in A,+ is
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a
Ph1-positive erythroid progenitor cell line spontaneously derived
from a blast-crisis patient (33), is represented in C. Cells (1.5 x 10^7)
were metabolically labeled with 2 mCi of [32P]orthophosphate for 3-4
hr and then were extracted and clarified by centrifugation. Samples
were immunoprecipitated with control normal serum (lanes 1),
anti-pEX-2 (lanes 2), or anti-pEX-5 (lanes 3) and analyzed by
NaDodSO4/8% PAGE followed by autoradiography with an inten-
sifying screen (3 days for A and C, 10 days for B).

phase CML patients, in which P210c-abl was detected at a
10-fold higher level than P145c-abl (refs. 10 and 11; Table 1).
There was no large difference in level of chimeric mRNA and
P210c-abl expressed in four myeloid/erythroid-lineage Ph1-
positive cell lines (K562, EM2, EM3, CML22, and BV173;
refs. 9 and 11), despite a 4- to 5-fold amplification of
abl-related sequences in the K562 cell line.

Detection of different levels of P210c-abl in Fig. 1 could be
due to decreased phosphorylation of P210c-abl, a lower level
of P210c-abl synthesis, or altered stability of the protein. To
help distinguish among these possibilities, the steady-state
level of P210c-abl in the cell lines was assayed by immuno-
blotting. The results show that SK-CML7Bt-33 (Fig. 2A, +)
had a higher level of P210c-abl than P145, similar to the results
with metabolic labeling (Fig. 1). We did not detect P210c-abl
by immunoblotting with 2 x 10^7 cells of line SK-CML8Bt-3
(Fig. 2B, +). Reconstruction experiments using dilutions of
cell extracts showed that we could detect about 5-10% the
level of P210c-abl expressed in the K562 cell line (data not
shown). We infer that the steady-state level of P210c-abl in
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by
a factor of at least 10. The level of P210c-abl detected in these
assays correlated with the amount of P210c-abl tyrosine kinase
activity that could be detected in vitro (data not shown).

Different Levels of P210c-abl Are Reflected in the Amount of
Stable bcr-abl mRNA. To identify the basis for detection of
variable levels of P210c-abl, we examined the production of
the abl RNA. RNA blot hybridization analysis using a v-abl
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl
mRNAs were present at a similar level in Ph1-positive and
-negative cell lines derived from different patients. However,
the 8-kb mRNA that encodes P210c-abl was detected at a
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in
SK-CML16Bt-1 (B, +), which correlated with the relative
level of P210c-abl detected in each cell line. Analysis of
additional cell lines demonstrated that the level of 8-kb RNA
directly correlated with the level of P210c-abl (Table 1). The
variation in level of 8-kb RNA detected in these cell lines was
not due to loss or gain of Ph1, because cytogenetic analysis
confirmed the presence of Ph1 in these cell lines (ref. 16 and
Table 1. Relative levels of bcr-abl expression in Epstein-Barr
virus-immortalized B-cell lines and myeloid CML lines

Cell line*       CML phase†   Ph1‡   P210§    mRNA¶
SK-CML7BN-2      BC
SK-CML8BN-10     Chronic
SK-CML8BN-12     Chronic
SK-CML16BN-1     Chronic
SK-CML35BN-1     Chronic
SK-CML7Bt-33     BC           +      +++      +++
SK-CML21Bt-1     Acc          +      +++      +++
SK-CML21Bt-6     Acc          +      +++      +++
SK-CML8Bt-3      Chronic      +
SK-CML16Bt-1     Chronic      +      +        +
SK-CML35Bt-2     Chronic      +      +        +
K562             BC           +      +++++    +++++
BV173            BC           +      +++++    +++++
EM2              BC           +      +++++    +++++

*Cell lines derived from CML patients by transformation with
Epstein-Barr virus as described (16). Names of cell lines indicate
patient number and Ph1 status: SK-CML7Bt indicates a cell line
derived from patient 7 that carries the 9;22 Ph1 translocation; N
indicates a normal karyotype. Myeloid-erythroid cell lines (K562,
EM2, and BV173) are described in previous publications (9, 11, 22,
33).

†Status of patient at the time cell line was derived. BC, blast crisis;
Acc, accelerated phase.

‡Presence (+) or absence (-) of Ph1 as demonstrated by karyotypic
or Southern blot analysis.

§P210c-abl detected as described in legend to Fig. 1. B-cell lines
derived from blast-crisis and accelerated-phase patients had levels
of P210 3- to 5-fold higher (+++) than levels of P145. Chronic-
phase-derived cell lines had P210 levels lower than or just equivalent
(+) to the level of P145. Myeloid and erythroid lines had levels of
P210 5- to 10-fold higher than P145 (+++++).

¶Eight-kilobase bcr-abl mRNA detected as described in legend to
Fig. 2. Symbols: ±, borderline detectable; +++++, level of 8-kb
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of
the 6- and 7-kb species; +, a level approximately equivalent to that
of the 6- and 7-kb messages.
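
The semiquantitative symbols in the table footnotes amount to binning a ratio (P210 relative to P145, or 8-kb relative to 6- and 7-kb mRNA). The sketch below shows one way to express that binning; the function name `expression_symbol` is hypothetical, and the cutoffs are an assumption, since the footnote ranges share the 5-fold boundary and do not define a symbol for ratios between roughly 1 and 3. The "±" (borderline detectable) category is not modeled.

```python
def expression_symbol(ratio: float) -> str:
    """Map a bcr-abl / c-abl ratio to the table's plus-sign notation.

    Assumed cutoffs (not stated exactly in the source):
      ratio >= 5      -> "+++++"  (5- to 10-fold higher)
      3 <= ratio < 5  -> "+++"    (3- to 5-fold higher)
      0 < ratio < 3   -> "+"      (lower than or about equal)
      ratio <= 0      -> "-"      (not detected)
    """
    if ratio <= 0:
        return "-"
    if ratio >= 5:
        return "+++++"
    if ratio >= 3:
        return "+++"
    return "+"
```

Under these assumed cutoffs, the ratios quoted in the Results (about 4- to 5-fold for SK-CML7Bt-33, about 10-fold for K562, and about one-fifth for SK-CML16Bt-1) fall into the +++, +++++, and + classes respectively, matching the table.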

data not shown). There was no difference in the copy number
of abl-related sequences as judged by Southern blot analysis
(Fig. 4). Only the K562 cell line control showed an amplifi-
cation of abl sequences, as previously reported (22, 23).
These combined data suggest that differential bcr-abl mRNA
expression from a single gene template is responsible for the
variable levels of P210c-abl detected. This could be mediated




Fig. 2. Analysis of steady-state abl protein levels by immuno-
blotting. Cell extracts prepared from 2 x 10^7 cells of lines SK-
CML7BN-2 (A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-),
and SK-CML8Bt-3 (B,+) were concentrated by immunoprecip-
itation with anti-pEX-2 plus anti-pEX-5. Samples were then electro-
phoresed in a NaDodSO4/8% polyacrylamide gel and transferred to
nitrocellulose, using protease-facilitated transfer (18). abl proteins
were detected using a mixture of two monoclonal antibodies directed
against the pEX-2 and pEX-5 abl-protein fragments produced in
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse
second-stage antibody (Bio-Rad) for development.



[Fig. 3 image: panels A (-, +), B (-, +), and C (+); size markers at 8, 7, and 6 kb]

Fig. 3. Comparison of abl RNA levels in Ph1-positive and
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization
using a v-abl probe. RNA was extracted from Ph1-negative lines
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from Ph1-pos-
itive lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-1 (B,+), and
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20).
Polyadenylylated RNA was purified by oligo(dT) affinity chroma-
tography, and 15 µg of each sample was electrophoresed in a 1%
agarose/formaldehyde gel and then transferred to nitrocellulose. The
blotted RNAs were hybridized with a nick-translated v-abl fragment
probe (21) and then autoradiographed for 4 days.



by factors influencing the transcription rate of the bcr-abl 
gene or the stability of the mRNA. 
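
The two explanations offered here, a change in transcription rate or a change in message stability, cannot be told apart from steady-state levels alone. A minimal sketch under an assumed model (constant synthesis with first-order decay, which the paper does not itself invoke) makes the point; the function name and the numeric values are illustrative only.

```python
import math


def steady_state_mrna(synthesis_rate: float, half_life: float) -> float:
    """Steady-state level for constant synthesis and first-order decay.

    level = synthesis_rate / k, where k = ln(2) / half_life.
    Units are arbitrary but must be consistent.
    """
    if synthesis_rate < 0 or half_life <= 0:
        raise ValueError("synthesis_rate must be >= 0 and half_life > 0")
    decay_constant = math.log(2) / half_life
    return synthesis_rate / decay_constant


# Hypothetical values: either change alone gives the same 10-fold difference.
baseline = steady_state_mrna(synthesis_rate=1.0, half_life=1.0)
faster_transcription = steady_state_mrna(synthesis_rate=10.0, half_life=1.0)
more_stable_message = steady_state_mrna(synthesis_rate=1.0, half_life=10.0)
```

In this simple model a 10-fold difference in the 8-kb mRNA, like that seen between SK-CML7Bt-33 and SK-CML16Bt-1, is equally consistent with a 10-fold higher transcription rate or a 10-fold longer half-life, so separating the two would require direct measurements of one of them.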

DISCUSSION 

Several lines of evidence suggest that formation of Ph1 is not
the primary event that affects the stem cell in CML. Patients
have been identified that present with the clinical picture of
CML but only later develop Ph1 (1). This observation,
coupled with studies of G6PD (glucose-6-phosphate dehy-
drogenase)-heterozygous females with CML that demon-
strate stem-cell clonality by isozyme analysis among cell
populations that lack the Ph1 marker, supports a secondary
or complementary role for Ph1 in the progression of the
disease (24, 25). This chromosome marker is found in
chronic, accelerated, and blast-crisis phases of the disease. It
is likely that Ph1 confers some growth advantage, since cells
with the marker chromosome eventually predominate the
marrow and peripheral blood even in chronic phase. During
the phase of blast crisis, many patients develop additional
chromosome abnormalities, including duplication of Ph1, a
variety of trisomies, and complex translocations (26). This
is suggestive evidence for Ph1 being a necessary but not
sufficient genetic change for the full evolution of the
disease.

Fig. 4. Southern blot analysis of abl sequences in Ph1-positive
and -negative B-cell lines. High molecular weight DNA (15 µg) was
digested with restriction endonuclease BamHI, separated in a 0.8%
agarose gel, and then transferred to nitrocellulose. The blotted DNA
fragments were hybridized with a nick-translated, 2.4-kb Bgl II v-abl
fragment (1.5 x 10^8 cpm/µg; ref. 21) and exposed for 4 days. (A)
Autoradiogram of abl-specific fragments in cell lines HL-60 (lane 1),
EM2 (lane 2), K562 (lane 3), SK-CML7Bt-33 (lane 4), SK-CML8Bt-3
(lane 5), SK-CML16Bt-1 (lane 6), SK-CML21Bt-6 (lane 7), SK-
CML35Bt-2 (lane 8), SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane
10), and SK-CML35BN-1 (lane 11). (B) Ethidium bromide staining of
agarose gel prior to transfer to nitrocellulose, showing the level of
variation in amount of DNA loaded per lane.

The realization that one molecular result of Ph 1 is the 
generation of a chimeric bcr-abl protein with functional 
characteristics and structure analogous to the gag-abl trans- 
forming protein of the Abelson murine leukemia virus 
strengthens the argument for an important role of Ph 1 in the 
pathogenesis of CML. Although the Abelson virus is gener- 
ally considered a rapidly transforming retrovirus, its effects 
can range from overcoming growth factor requirements, to 
cellular lethality, to induction of highly oncogenic tumors in 
a number of hematopoietic cell lineages (27, 28). Even in the 
transformation of murine cell targets, there are several lines 
of evidence that suggest that the growth-promoting activity of 
the v-abl gene product is complemented by further cellular
changes in the production of the malignant-cell phenotype 
(29-31). 

The regulation of bcr-abl gene expression is complex
because the 5' end of the gene is derived from the non-abl
sequences, bcr, normally found on chromosome 22 (6). The
level of stable message for the normal bcr gene and the
normal abl gene are both much lower than the level of the
bcr-abl message and protein from cell lines and clinical
specimens derived from myeloid blast-crisis patients (5, 6,
11). Therefore, the high level of bcr-abl expression cannot
simply be attributed to the regulatory sequences associated
with bcr. Possibly, creation of the chimeric gene disrupts the
normal regulatory sequences and results in a higher level of
expression. Variation in bcr-abl expression may result from
secondary changes in the structure of the chimeric gene or
function of trans-acting factors that occur during evolution of
the disease. Our analysis of P210c-abl and the 8-kb mRNA in
Epstein-Barr virus-transformed Ph1-positive B-cell lines
demonstrates that stable message and protein levels from the
bcr-abl gene can vary over a wide range. This variation does
not result from a change in the number of bcr-abl templates
secondary to gene amplification but more likely from changes
in either transcription rate or mRNA stability. We suspect
this range of bcr-abl expression is not limited to lymphoid
cells. Analysis of peripheral blood leukocytes derived from
an unusual CML patient who has been in chronic phase with
myeloid predominance for 16 years showed a level of
P210c-abl one-fifth that of P145c-abl, as detected by metabolic
labeling with [32P]orthophosphate and immunoprecipitation
(S.C., O.N.W., and P. Greenberg, unpublished observa-
tions). Lower levels of expression of the chimeric mRNA
have been demonstrated in clinical samples from chronic-
phase CML patients compared to acute-phase CML patients
(9). Others have reported chronic-phase patients with vari-
able but, in some cases, relatively high levels of the bcr-abl
mRNA (32). The sampling variation and the heterogeneous
mixture of cell types in clinical samples complicate such
analyses. Further work is needed to evaluate whether there
is a defined change in P210c-abl expression during the pro-
gression of CML. It is interesting to note that among the
limited sample of Ph1-positive B-cell lines we have examined
(Table 1), we have seen higher levels of P210c-abl in those
derived from patients at more advanced stages of the disease.
It will be important to search for cell-type-specific mecha-
nisms that might regulate expression of bcr-abl from Ph1.
Review

Proteome analysis: Biological assay or data archive?

Paul A. Haynes
Steven P. Gygi
Daniel Figeys
Ruedi Aebersold

Department of Molecular Biotechnology, University of
Washington, Seattle, WA, USA

In this review we examine the current state of proteome analysis. There are
three main issues discussed: why it is necessary to study proteomes; how pro-
teomes can be analyzed with current technology; and how proteome analysis
can be used to enhance biological research. We conclude that proteome anal-
ysis is an essential tool in the understanding of regulated biological systems.
Current technology, while still mostly limited to the more abundant proteins,
enables the use of proteome analysis both to establish databases of proteins
present, and to perform biological assays involving measurement of multiple
variables. We believe that the utility of proteome analysis in future biological
research will continue to be enhanced by further improvements in analytical
technology.

Contents

1 Introduction
2 Rationale for proteome analysis
2.1 Correlation between mRNA and protein expression levels
2.2 Proteins are dynamically modified and processed
2.3 Proteomes are dynamic and reflect the state of a biological system
3 Description and assessment of current proteome analysis technology
3.1 Technical requirements of proteome technology
3.2 2D electrophoresis - mass spectrometry: a common implementation of proteome analysis
3.3 Protein identification by LC-MS/MS, capillary LC-MS/MS and CE-MS/MS
3.4 Assessment of 2-DE-MS proteome technology
4 Utility of proteome analysis for biological research
4.1 The proteome as a database
4.2 The proteome as a biological assay
5 Concluding remarks
6 References

1 Introduction

A proteome has been defined as the protein complement
expressed by the genome of an organism, or, in multicel-
lular organisms, as the protein complement expressed by a
tissue or differentiated cell [1]. In the most common im-
plementation of proteome analysis the proteins extracted
from the cell or tissue analyzed are separated by high
resolution two-dimensional gel electrophoresis (2-DE),
detected in the gel and identified by their amino acid
sequence. The ease, sensitivity and speed with which gel-
separated proteins can be identified by the use of recently
developed mass spectrometry techniques have dramati-
cally increased the interest in proteome technology. One
of the most attractive features of such analyses is that com-
plex biological systems can potentially be studied in their
entirety, rather than as a multitude of individual compo-
nents. This makes it far easier to uncover the many com-
plex, and often obscure, relationships between mature
products in cells. Large-scale proteome characteriza-
tion projects have been undertaken for a number of dif-
ferent organisms and cell types. Microbial proteome pro-
jects currently in progress include, for example: Saccharo-
myces cerevisiae [2], Salmonella enterica [3], Spiroplasma
melliferum [4], Mycobacterium tuberculosis [5], Ochrobac-
trum anthropi [6], Haemophilus influenzae [7], Synecho-
cystis spp. [8], Escherichia coli [9], Rhizobium legumino-
sarum [10], and Dictyostelium discoideum [11]. Proteome
projects underway for tissues of more complex organ-
isms include those for: human bladder squamous cell
carcinomas [12], human liver [13], human plasma [13],
human keratinocytes [12], human fibroblasts [12], mouse
kidney [12], and rat serum [14]. In this manuscript we cri-
tically assess the concept of proteome analysis and the
technical feasibility of establishing complete proteome
maps, and discuss ways in which proteome analysis and
biological research intersect.

Correspondence: Professor Ruedi Aebersold, Department of Molecular
Biotechnology, University of Washington, Box 357730, Seattle, WA,
98195, USA (Tel: +206-685-4235; Fax: +206-685-6392; E-mail:
ruedi@u.washington.edu)

Abbreviations: CID, collision-induced dissociation; MS/MS, tandem
mass spectrometry; SAGE, serial analysis of gene expression

Keywords: Proteome / Two-dimensional polyacrylamide gel electro-
phoresis / Tandem mass spectrometry



2 Rationale for proteome analysis 

The dramatic growth in both the number of genome 
projects and the speed with which genome sequences 
are being determined has generated huge amounts of 
sequence information, for some species even complete 
genomic sequences [15-17]. The description of the
state of a biological system by the quantitative measure- 
ment of system components has long been a primary 
objective in molecular biology. With recent technical 
advances including the development of differential dis- 
play-PCR [18], cDNA microarray and DNA chip techno- 
logy [19, 20] and serial analysis of gene expression 
(SAGE) [21, 22], it is now feasible to establish global and 
quantitative mRNA expression maps of cells and tissues, 
in which the sequence of all the genes is known, at a 
speed and sensitivity which is not matched by current 
protein analysis technology. Given the long-standing
paradigm in biology that DNA synthesizes RNA which 
synthesizes protein, and the ability to rapidly establish 
global, quantitative mRNA expression maps, the ques- 
tions which arise are why technically complex proteome 
projects should be undertaken and what specific types of 
information could be expected from proteome projects 
which cannot be obtained from genomic and transcript 
profiling projects. We see three main reasons for pro- 
teome analysis to become an essential component in the 
comprehensive analysis of biological systems. (i) Protein
expression levels are not predictable from the mRNA 
expression levels, (ii) proteins are dynamically modified 
and processed in ways which are not necessarily 
apparent from the gene sequence, and (iii) proteomes 
are dynamic and reflect the state of a biological system. 

2.1 Correlation between mRNA and protein expression 
levels 

Interpretations of quantitative mRNA expression profiles 
frequently implicitly or explicitly assume that for specific 
genes the transcript levels are indicative of the levels of 
protein expression. As part of an ongoing study in our 
laboratory, we have determined the correlation of expres- 
sion at the mRNA and protein levels for a population of 
selected genes in the yeast Saccharomyces cerevisiae 
growing at mid-log phase (S. P. Gygi et al., submitted for
publication). mRNA expression levels were calculated 
from published SAGE frequency tables [22]. Protein 
expression levels were quantified by metabolic radiola- 
beling of the yeast proteins, liquid scintillation counting 
of the protein spots separated by high resolution 2-DE 
and mass spectrometric identification of the protein(s) 
migrating to each spot. The selected 80 samples consti- 
tute a relatively homogeneous group with respect to pre- 
dicted half-life and expression level of the protein pro- 
ducts. Thus far, we have found a general trend but no 
strong correlation between protein and transcript levels 
(Fig. 1). For some genes studied equivalent mRNA trans- 
cript levels translated into protein abundances which 
varied by more than 50-fold. Similarly, equivalent steady- 
state protein expression levels were maintained by trans- 
cript levels varying by as much as 40-fold (S. P. Gygi 
et al., submitted). These results suggest that even for a
population of genes predicted to be relatively homoge- 
neous with respect to protein half-life and gene expres- 
sion, the protein levels cannot be accurately predicted 
from the level of the corresponding mRNA transcript. 
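
The comparison described in this section can be sketched numerically. The Python snippet below is only an illustration, not the authors' analysis: the gene names and abundance values are invented placeholders, and the Pearson coefficient is assumed here as the correlation measure because the text does not name the statistic that was used. The helper `protein_fold_range` shows how one might quantify the spread of protein levels among genes with nearly equal transcript levels.

```python
import math

# Hypothetical (mRNA level, protein level) pairs; values are invented
# for illustration and are not taken from the study.
GENES = {
    "geneA": (1.0, 2_000),
    "geneB": (1.2, 60_000),
    "geneC": (5.0, 15_000),
    "geneD": (20.0, 100_000),
    "geneE": (40.0, 110_000),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def protein_fold_range(data, mrna_lo, mrna_hi):
    """Ratio of highest to lowest protein level among genes whose
    mRNA level lies in the window [mrna_lo, mrna_hi]."""
    levels = [prot for mrna, prot in data.values() if mrna_lo <= mrna <= mrna_hi]
    if not levels:
        raise ValueError("no genes in the requested mRNA window")
    return max(levels) / min(levels)


mrna = [m for m, _ in GENES.values()]
protein = [p for _, p in GENES.values()]
print("Pearson r:", round(pearson(mrna, protein), 3))
print("Protein fold range at similar mRNA:", protein_fold_range(GENES, 0.8, 1.5))
```

A general trend with wide scatter would appear as a positive but modest coefficient together with a large fold range inside a narrow mRNA window.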

2.2 Proteins are dynamically modified and processed 

In the mature, biologically active form many proteins are 
post-translationally modified by glycosylation, phosphor- 
ylation, prenylation, acylation, ubiquitination or one or 
more of many other modifications [23] and many proteins
are only functional if specifically associated or complexed
with other molecules, including DNA, RNA, proteins
and organic and inorganic cofactors. Frequently,
modifications are dynamic and reversible and may alter 
the precise three-dimensional structure and the state of 
activity of a protein. Collectively, the state of modifica- 
tion of the proteins which constitute a biological system 
are important indicators for the state of the system. The
type of protein modification and the sites modified at a
specific cellular state can usually not be determined
from the gene sequence alone.

Figure 1. Correlation between mRNA and protein levels in yeast cells.
For a selected population of 80 genes, protein levels were measured
by 35S-radiolabeling and mRNA levels were calculated from published
SAGE tables. Inset: expanded view of the low abundance region.
For more experimental details, also see Figs. 5 and 6 (S. P. Gygi et al.,
submitted).

2.3 Proteomes are dynamic and reflect the state of a 
biological system 

A single genome can give rise to many qualitatively and 
quantitatively different proteomes. Specific stages of the 
cell cycle and states of differentiation, responses to 
growth and nutrient conditions, temperature and stress, 
and pathological conditions represent cellular states 
which are characterized by significantly different proteomes.
The proteome, in principle, also reflects events
that are under translational and post-translational control.
It is therefore expected that proteomics will be able
to provide the most precise and detailed molecular description
of the state of a cell or tissue, provided that the
external conditions defining the state are carefully determined.
In answer to the question of whether the study
of proteomes is necessary for the analysis of biomolecular
systems, it is evident that the analysis of mature protein
products in cells is essential as there are numerous
levels of control of protein synthesis, degradation,
processing and modification, which are only apparent by
direct protein analysis.



3 Description and assessment of current proteome 
analysis technology 

3.1 Technical requirements of proteome technology 

In biological systems the level of expression as well as 
the states of modification, processing and macro-molec- 
ular association of proteins are controlled and modu- 
lated depending on the state of the system. Comprehen- 
sive analysis of the identity, quantity and state of modifi- 
cation of proteins therefore requires the detection and 
quantitation of the proteins which constitute the system,
and analysis of differentially processed forms. There are 
a number of inherent difficulties in protein analysis 
which complicate these tasks. First, proteins cannot be 
amplified. It is possible to produce large amounts of a 
particular protein by over-expression in specific cell sys- 
tems. However, since many proteins are dynamically 
post-translationally modified, they cannot be easily am- 
plified in the form in which they finally function in the 
biological system. It is frequently difficult to purify from 
the native source sufficient amounts of a protein for 
analysis. From a technological point of view this trans- 
lates into the need for high sensitivity analytical tech- 
niques. Second, many proteins are modified and pro- 
cessed post-translationally. Therefore, in addition to the 
protein identity, the structural basis for differentially 
modified isoforms also needs to be determined. The dis- 
tribution of a constant amount of protein over several 
differentially modified isoforms further reduces the 
amount of each species available for analysis. The complexity
and dynamics of post-translational protein editing
thus significantly complicate proteome studies.
Third, proteins vary dramatically with respect to their 
solubility in commonly used solvents. There are few, if 
any, solvent conditions in which all proteins are soluble 
and which are also compatible with protein analysis. This 
makes the development of protein purification methods 
particularly difficult since both protein purification and 
solubility have to be achieved under the same condi- 
tions. Detergents, in particular sodium dodecyl sulfate 
(SDS), are frequently added to aqueous solvents to 
maintain protein solubility. The compatibility with SDS 
is a big advantage of SDS polyacrylamide gel electro- 
phoresis (SDS-PAGE) over other protein separation 
techniques. Thus, SDS-PAGE and two-dimensional gel 
electrophoresis, which also uses SDS and other deter- 
gents, are the most general and preferred methods for 
the purification of small amounts of proteins, provided 
that activity does not necessarily need to be maintained. 
Lastly, the number of proteins in a given cell system is 
typically in the thousands. Any attempt to identify and 
categorize all of these must use methods which are as 
rapid as possible to allow completion of the project 
within a reasonable time frame. Therefore, a successful, 
general proteomics technology requires high sensitivity, 
high throughput, the ability to differentiate differentially 
modified proteins, and the ability to quantitatively dis- 
play and analyze all the proteins present in a sample. 

3.2 2-D electrophoresis — mass spectrometry: a common 
implementation of proteome analysis 

The most common currently used implementation of 
proteome analysis technology is based on the separation 
of proteins by two-dimensional (IEF/SDS-PAGE) gel 
electrophoresis and their subsequent identification and 
analysis by mass spectrometry (MS) or tandem mass 
spectrometry (MS/MS). In 2-DE, proteins are first separ- 
ated by isoelectric focusing (IEF) and then by SDS- 
PAGE, in the second, perpendicular dimension. Separ- 
ated proteins are visualized at high sensitivity by staining 
or autoradiography, producing two-dimensional arrays of 
proteins. 2-DE gels are, at present, the most commonly 
used means of global display of proteins in complex 



Electrophoresis 1998, 19, 1862-1871

samples. The separation of thousands of proteins has 
been achieved in a single gel [24, 25] and differentially 
modified proteins are frequently separated. Due to the 
compatibility of 2-DE with high concentrations of deter- 
gents, protein denaturants and other additives promoting 
protein solubility, the technique is widely used. 

The second step of this type of proteome analysis is the 
identification and analysis of separated proteins. Individ- 
ual proteins from polyacrylamide gels have traditionally 
been identified using N-terminal sequencing [26, 27],
internal peptide sequencing [28, 29], immunoblotting or 
comigration with known proteins [30]. The recent dra- 
matic growth of large-scale genomic and expressed 
sequence tag (EST) sequence databases has resulted in a
fundamental change in the way proteins are identified by
their amino acid sequence. Rather than by the traditional 
methods described above, protein sequences are now fre- 
quently determined by correlating mass spectral or 
tandem mass spectral data of peptides derived from pro- 
teins, with the information contained in sequence data- 
bases [31-33]. 

There are a number of alternative approaches to pro- 
teome analysis currently under development. There is 
considerable interest in developing a proteome analysis 
strategy which bypasses 2-DE altogether, because it is
considered a relatively slow and tedious process, and 
because of perceived difficulties in extracting proteins 
from the gel matrix for analysis. However, 2-DE as a 
starting point for proteome analysis has many advan- 
tages compared to other techniques available today. The 
most significant strengths of the 2-DE-MS approach 
include the relatively uniform behavior of proteins in 
gels, the ability to quantify spots and the high resolution 
and simultaneous display of hundreds to thousands of 
proteins within a reasonable time frame. 

A schematic diagram of a typical procedure of the identi- 
fication of gel-separated proteins is shown in Fig. 2. Pro- 
tein spots detected in the gel are enzymatically or chemi- 
cally fragmented and the peptide fragments are isolated 
for analysis, as already indicated, most frequently by MS 
or MS/MS. There are numerous protocols for the generation
of peptide fragments from gel-separated proteins.
They can be grouped into two categories, digestion in
the gel slice [28, 34] or digestion after electrotransfer out
of the gel onto a suitable membrane ([29, 35-37] and 
reviewed in [38]). In most instances either technique is 
applicable and yields good results. The analysis of MS or 
MS/MS data is an important step in the whole process 
because MS instruments can generate an enormous 
amount of information which cannot easily be managed 
manually. Recently, a number of groups have developed 
software systems dedicated to the use of peptide MS 
and MS/MS spectra for the identification of proteins. 
Proteins are identified by correlating the information 
contained in the MS spectra of protein digests or 
MS/MS spectra of individual peptides with data con- 
tained in DNA or protein sequence databases. 
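
The database correlation described above can be illustrated for the simpler MS-only case (peptide mass matching). The Python sketch below is a hedged illustration, not one of the software systems the text refers to: it assumes trypsin cleavage after K or R except before P with no missed cleavages, unmodified peptides, monoisotopic masses, and a two-entry "database" whose protein names and sequences are hypothetical.

```python
# Monoisotopic amino acid residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(observed_masses, database, tol=0.5):
    """Rank database entries by how many observed peptide masses they explain."""
    ranking = []
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        hits = sum(
            any(abs(obs - t) <= tol for t in theoretical)
            for obs in observed_masses
        )
        ranking.append((name, hits))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Hypothetical database and observed masses, for illustration only.
DATABASE = {"protX": "GASPKVTLR", "protY": "MKWWFR"}
print(rank_proteins([458.25, 487.31], DATABASE))
```

The entry that explains the most observed masses is taken as the candidate identification; real tools add scoring statistics, missed cleavages and modifications that this sketch omits.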

Figure 2. Schematic diagram of a procedure for identification of gel-separated
proteins. Peptides can either be separated by a technique
such as LC or CE, or infused as a mixture and sorted in the MS. Database
searching can either be performed on peptide masses from an
MS spectrum, peptide fragment masses from CID spectra of peptides,
or a combination of both.

The systems we are currently using in our laboratory are
based on the separation of the peptides contained in protein
digests by narrow bore or capillary liquid chromatography
[39, 40] or capillary electrophoresis [41], the analysis
of the separated peptides by electrospray ionization
(ESI) MS/MS, and the correlation of the generated
peptide spectra with sequence databases using the
SEQUEST program developed at the University of Washington
[32, 33]. The system automatically performs the
following operations: a particular peptide ion character- 
ized by its mass-to-charge ratio is selected in the MS out 
of all the peptide ions present in the system at a parti- 
cular time; the selected peptide ion is collided in a colli- 
sion cell with argon (collision-induced dissociation, 
CID) and the masses of the resulting fragment ions are 
determined in the second sector of the tandem MS; this 
experimentally determined CID spectrum is then corre- 
lated with the CID spectra predicted from all the pep- 
tides in a sequence database which have essentially the 
same mass as the peptide selected for CID; this correla- 
tion matches the isolated peptide with a sequence seg- 
ment in a database and thus identifies the protein from 
which the peptide was derived. There are a number of 
alternative programs which use peptide CID spectra for 
protein identification, but we use the SEQUEST system 
because it is currently the most highly automated pro- 
gram and has proven to be successful, versatile and 
robust. 
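
The sequence of operations listed above (select candidate peptides by precursor mass, predict their fragment spectra, compare with the measured CID spectrum) can be sketched as follows. This is a simplified illustration and not the SEQUEST implementation: a plain shared-peak count is assumed as a stand-in for the correlation scoring, only singly charged b- and y-ions of unmodified peptides are predicted, and the candidate peptide list and tolerances are hypothetical.

```python
# Monoisotopic amino acid residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_ions(peptide):
    """Predicted m/z values of singly charged b- and y-ions."""
    ions = []
    for i in range(1, len(peptide)):
        b_ion = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        y_ion = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        ions.extend([b_ion, y_ion])
    return ions


def shared_peaks(observed, predicted, tol):
    """Number of observed peaks lying within tol of any predicted peak."""
    return sum(any(abs(o - p) <= tol for p in predicted) for o in observed)


def identify(precursor_mass, observed, candidates,
             precursor_tol=1.0, fragment_tol=0.3):
    """Return (peptide, score) for the best-scoring candidate whose mass
    matches the precursor, or None if no candidate has a matching mass."""
    scored = []
    for peptide in candidates:
        if abs(peptide_mass(peptide) - precursor_mass) <= precursor_tol:
            score = shared_peaks(observed, fragment_ions(peptide), fragment_tol)
            scored.append((score, peptide))
    if not scored:
        return None
    score, best = max(scored)
    return best, score


# Hypothetical example: a simulated spectrum of VTLR tested against
# candidates that include TVLR, which has the same mass.
spectrum = fragment_ions("VTLR")
print(identify(peptide_mass("VTLR"), spectrum, ["GASPK", "TVLR", "VTLR"]))
```

The example shows why fragment data are needed: VTLR and TVLR cannot be told apart by precursor mass, but they differ in part of their predicted fragment ions.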

3.3 Protein identification by LC-MS/MS, capillary 
LC-MS/MS and CE-MS/MS 

It has been demonstrated repeatedly that MS has a very 
high intrinsic sensitivity. For the routine analysis of gel- 
separated proteins at high sensitivity, the most signif- 
icant challenge is the handling of small amounts of 
sample. The crux of the problem is the extraction and 
transferal of peptide mixtures generated by the digestion 
of low nanogram amounts of protein, from gels into the 
MS/MS system without significant loss of sample or 
introduction of unwanted contaminants. We employ 
three different systems for introducing gel-purified samples
into an MS, depending on the level of sensitivity
required. As an approximate guideline, for samples containing
tens of picomoles of peptides, LC-MS/MS is
most appropriate; for samples containing low picomole
amounts to high femtomole amounts we use capillary
LC-MS/MS; and for samples containing femtomoles or
less, CE-MS/MS is the method of choice.

3.3.1 LC-MS/MS 

The coupling of an MS to an HPLC system using a 
0.5 mm diameter or bigger reverse phase (RP) column 
has been described in detail [42]. This system has several 
advantages if a large number of samples are to be ana- 
lyzed and all are available in sufficient quantity. The 
LC-MS and database searching program can be run in a 
fully automated mode using an autosampler, thus maxi- 
mizing sample throughput and minimizing the need for 
operator interference. The relatively large column is 
tolerant of high levels of impurities from either gel prep- 
aration or sample matrix. Lastly, if configured with a 
flow-splitter and micro-sprayer [40], analyses can be per- 
formed on a small fraction of the sample (less than 5 %) 
while the remainder of the sample is recovered in very 
pure solvents. This latter feature is particularly useful 
when an orthogonal technique is also used to analyze 
peptide fractions, such as scintillation of an introduced 
radiolabel, and this data can be correlated with peptides 
identified by CID spectra. 

3.3.2 Capillary LC-MS 

An increase of sensitivity of approximately tenfold can be 
achieved by using a capillary LC system with a 100 µm ID
column rather than a 0.5 mm ID column as referred to 
above. Since very low flow rates are required for such 
columns, most reports have used a precolumn flow split- 
ting system for producing solvent gradients. We have 
recently described the design and construction of a novel
gradient mixing system which enables the formation
of reproducible gradients at very low flow rates (low
nL/min) without the need for flow splitting (A. Ducret
et al., submitted for publication). Using this capillary
LC-MS/MS system we were able to identify gel-separat- 
ed proteins if low picomole to high femtomole amounts 
were loaded onto the gel [40]. This system is as yet not 
automated and, like all capillary LC systems, is prone to 
blockage of the columns by microparticulates when ana- 
lyzing gel-separated proteins. 

3.3.3 CE-MS/MS 

The highest level of sensitivity for analyzing gel-sep- 
arated proteins can be achieved by using capillary elec- 
trophoresis — mass spectrometry (CE-MS). We have de- 
scribed in the past a solid-phase extraction capillary elec- 
trophoresis (SPE-CE) system which was used with triple 
quadrupole and ion trap ESI-MS/MS systems for the 
identification of proteins at the low femtomole to sub-femtomole
sensitivity level [43, 44]. While this system is
highly sensitive, its operation is labor-intensive and
has not been automated. In order to devise an
analytical system with both the sensitivity of a CE and 
the level of automation of LC, we have constructed 
microfabricated devices for the introduction of samples
into ESI-MS for high-sensitivity peptide analysis.

Figure 3. Schematic illustration of a
microfabricated analytical system for CE,
consisting of a micromachined device,
coated capillary electroosmotic pump,
and microelectrospray interface. The
dimensions of the channels and reservoir
are as indicated in the text. The channels
on the device were graphically enhanced
to make them more visible. Reproduced
from [45], with permission.

The basic device is a piece of glass into which channels 
of 10-30 µm in depth, and 50-70 µm in diameter are
etched by using photolithography/etching techniques 
similar to the ones used in the semiconductor industry. 
(A simple device is shown in Fig. 3). The channels are 
connected to an external high voltage power supply [45]. 
Samples are manipulated on the device and off the 
device to the MS by applying different potentials to the 
reservoirs. This creates a solvent flow by electroosmotic
pumping which can be redirected by changing the posi- 
tion of the electrode. Therefore, without the need for 
valves or gates and without any external pumping, the 
flow can be redirected by simply switching the position 
of the electrodes on the device. The direction and rate of 
the flow can be modulated by the size and the polarity 
of the electric field applied and also by the charge state 
of the surface. 

The type of data generated by the system is illustrated in 
Fig. 4, which shows the mass spectrum of a peptide sample 
representing the tryptic digest of carbonic anhydrase at 
290 fmol/µL. Each numbered peak indicates a peptide successfully
identified as being derived from carbonic anhydrase.
Some of the unassigned signals may be chemical
or peptide contaminants. The MS is programmed to auto- 
matically select each peak and subject the peptide to CID. 
The resulting CID spectra are then used to identify the 
protein by correlation with sequence databases. Therefore, 
this system allows us to concurrently apply a number of 
protein digests onto the device, to sequentially mobilize 
the samples, to automatically generate CID spectra of 
selected peptide ions and to search sequence databases 
for protein identification. These steps are performed auto- 
matically without the need for user input and proteins can 
be identified at very low femtomole level sensitivity at a 
rate of approximately one protein per 15 min.

Figure 4. MS spectrum of a tryptic digest of carbonic anhydrase using
the microfabricated system shown in Fig. 3. 290 fmol/µL of carbonic
anhydrase tryptic digest was infused into a Finnigan LCQ ion trap MS.
Each peak was selected for CID, and those which were identified as
containing peptides derived from carbonic anhydrase are numbered.
Reproduced from [45], with permission.

3.4 Assessment of 2-DE-MS proteome technology

Figure 5. 2-DE separation of a lysate of yeast cells, with identified proteins highlighted. The first dimension of separation was an IPG from
pH 3-10, and the second dimension was a 10%T SDS-PAGE gel. Proteins were visualized by silver staining. Further details of experimental
procedures are included in S. P. Gygi et al. (submitted).

Using a combination of the analytical techniques described
above we have identified the 80 protein spots
indicated in Fig. 5. The protein pattern was generated by
separating a total of 40 microgram of protein contained
in a total cell lysate of the yeast strain YPH499 by high
resolution 2-DE and silver staining of the separated proteins.
To estimate how far this type of proteome analysis
can penetrate towards the identification of low abundance
proteins, we have calculated the codon bias of the
genes encoding the respective proteins. Codon bias is a
calculated measure of the degree of redundancy of triplet
DNA codons used to produce each amino acid in a
particular gene sequence. It has been shown to be a 
useful indicator of the level of the protein product of a 
particular gene sequence present in a cell [46]. The gen- 
eral rule which applies is that the higher the value of the 
codon bias calculated for a gene, the more abundant the 
protein product of that gene becomes. The calculated 
codon bias values corresponding to the proteins identi- 
fied in Fig. 5 are shown in Fig. 6b. Nearly all of the pro- 
teins identified (> 95%) have codon bias values of > 0.2, 
indicating they are highly abundant in cells. In contrast, 
codon bias values calculated for the entire yeast genome 
(Fig. 6a) show that the majority of proteins present in 
the proteome have a codon bias of < 0.2 and are thus of 
low abundance. 
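
The reasoning in this paragraph can be made concrete with a small sketch. The Python code below is an assumption-laden illustration: `preferred_codon_fraction` is a deliberately simplified stand-in (the fraction of codons drawn from a supplied set of preferred codons) and is not the codon bias measure of [46]; the preferred-codon set, the sequence and the example values are hypothetical. Only the 0.2 cut-off is taken from the text.

```python
def preferred_codon_fraction(coding_sequence, preferred_codons):
    """Fraction of codons in a coding sequence that belong to a given
    set of preferred codons (simplified stand-in for codon bias)."""
    usable = len(coding_sequence) - len(coding_sequence) % 3
    codons = [coding_sequence[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return sum(codon in preferred_codons for codon in codons) / len(codons)


def fraction_above(values, threshold=0.2):
    """Fraction of values strictly above the threshold."""
    if not values:
        raise ValueError("no values supplied")
    return sum(v > threshold for v in values) / len(values)


# Hypothetical preferred codons and coding sequence.
PREFERRED = {"GCT", "AAG"}
print(preferred_codon_fraction("GCTGCTAAGGCA", PREFERRED))

# Hypothetical bias values for an identified subset and a whole genome.
identified = [0.45, 0.62, 0.81, 0.30]
genome = [0.05, 0.10, 0.15, 0.45, 0.08, 0.62, 0.12, 0.18]
print(fraction_above(identified), fraction_above(genome))
```

Comparing the two fractions mirrors the comparison of Fig. 6a with Fig. 6b: a subset dominated by values above the cut-off, drawn from a population that lies mostly below it, indicates that the method samples mainly the abundant proteins.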

This finding is of considerable importance in our assess- 
ment of the current status of proteome analysis technol- 
ogy. It is clear that even using highly sensitive analytical 
techniques, we are only able to visualize and identify the 



more abundant proteins. Since many important regula- 
tory proteins are present only at low abundance, these 
would not be amenable to analysis using such tech- 
niques. This situation would be exacerbated in the anal- 
ysis of proteomes containing many more proteins than 
the approximately 6000 gene products present in yeast
cells [16]. In the analysis of, for example, the proteome
of any human cells, there are potentially 50 000-100 000
gene products [47]. Inherent limitations on the amount 
of protein that can be loaded on 2-DE, and the number 
of components that can be resolved, indicate that only 
the most highly abundant fraction of the many gene 
products could be successfully analyzed. One approach 
that has been employed to circumvent these limitations 
is the use of very narrow range immobilized pH gradient 
strips for the first-dimension separation of 2-DE [48]. 
Since only those proteins which focus within the narrow
range will enter the second dimension of separation, a 
much higher sample loading within the desired range is 
possible. This, in turn, can lead to the visualization and 
identification of less abundant proteins. 

Figure 6. Calculated codon bias values for yeast proteins. (A) Distribution
of calculated values for the entire yeast proteome. (B) Distribution
of calculated values for the subset of 80 identified proteins also
shown in Figs. 1 and 5. Further details of experimental procedures are 
included in S. P. Gygi et al. (submitted). 



4 Utility of proteome analysis for biological 
research 

For the success of proteomics as a mainstream approach
to the analysis of biological systems it is essential to
define how proteome analysis and biological research
projects intersect. Without a clear plan for the implementation
of proteome-type approaches into biological research
projects the full impact of the technology cannot
be realized. The literature indicates that proteome anal- 
ysis is used both as a database/data archive, and as a bio- 
logical assay or biological research tool. 



4.1 The proteome as a database



The use of proteomics as a database or data archive 
essentially entails an attempt to identify all the proteins 
in a cell or species and to annotate each protein with the 
known biological information that is relevant for each 
protein. The level of annotation can, of course, be exten- 
sive. The most common implementation of this idea is 
the separation of proteins by high resolution 2-DE, the
identification of each detected protein spot and the 
annotation of the protein spots in a 2-DE gel database 
format. This approach is complicated by the fact that it is 
difficult to precisely define a proteome and to decide 
which proteome should be represented in the database. 
In contrast to the genome of a species, which is essen- 
tially static, the proteome is highly dynamic. Processes 
such as differentiation, cell activation and disease can all
significantly change the proteome of a species. This is 
illustrated in Fig. 7. The figure shows two high-resolution
2-DE maps of proteins isolated from rat serum.
Fig. 7A is from the serum of normal rats, while Fig. 7B
is acute-phase serum from rats after
prior treatment with an inflammation-causing agent [49].
It is obvious that the protein patterns are significantly 
different in several areas, raising the question of exactly 
which proteome is being described. 

Therefore, a comprehensive proteome database of a spe- 
cies or cell type needs to contain all of the parameters 
which describe the state and the type of the cells from 
which the proteins were extracted as well as the software 
tools to search the database with queries which reflect 
the dynamics of biological systems. A comprehensive 
proteome database should be capable of quantitatively
describing the fate of each protein if specific systems
and pathways are activated in the cell. Specifically, the
quantity, the degree of modification, the subcellular loca- 
tion and the nature of molecules specifically interacting 
with a protein as well as the rate of change of these 
variables should be described. Using these admittedly 
stringent criteria, there is currently no complete proteome
database. A number of such databases are, however, in 
the process of being constructed. The most advanced 
among them, in our opinion, are the yeast protein data- 
base YPD [50] (accessible at http://www.ypd.com) and 
the human 2D-PAGE databases of the Danish Centre
for Human Genome Research [12] (accessible at
http://biobase.dk/cgi-bin/celis). While neither can be considered
complete as not all of the potential gene products
are identified, both contain extensive annotation
of supplemental information for many of the spots 
which are positively identified in reference samples. 

4.2 The proteome as a biological assay 

The use of proteome analysis as a biological assay or 
research tool represents an alternative approach to inte- 
grating biology with proteomics. To investigate the state 
of a system, samples are subjected to a specific process
that allows the quantitative or qualitative measurement 
of some of the variables which describe the system. In 
typical biochemical assays one variable (e.g., enzyme 
activity) of a single component (e.g., a particular en- 
zyme) is measured. Using proteomics as an assay, mul- 
tiple variables (e.g., expression level, rate of synthesis, 
phosphorylation state, etc.) are measured concurrently 
on many (ideally all) of the proteins in a sample. The 
use of proteomics as an assay is a less far-reaching prop- 
osition than the construction of a comprehensive pro- 
teome database. It does, however, represent a pragmatic 
approach which can be adapted to investigate specific 
systems and pathways, as long as the interpretation of 
the results takes into account that with current technol- 
ogy not all of the variables which describe the system 
can be observed (see Section 3.4). 

A common implementation of proteome analysis as a
biological assay is when a 2-DE protein pattern gener- 
ated from the analysis of an experimental sample is 
compared to an array of reference patterns representing 
different states of the system under investigation. The 
state of the experimental system at the time the sample 
was generated is therefore determined by the quantitative
comparative analysis of hundreds to a few thousand
proteins. Comparative analysis of the 2-DE patterns furthermore
highlights quantitative and qualitative differences
in the protein profiles which correlate with the
state of the system. For this type of analysis it is not
essential that all the proteins are identified or even visualized,
although the results become more informative as
more proteins are compared. It is obvious, however, that 
the possibility to identify any protein deemed character- 
istic for a particular state dramatically enhances this 
approach by opening up new avenues for experimenta- 
tion. 
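
The comparison of an experimental 2-DE pattern with reference patterns can be sketched as below. This Python example is a minimal illustration under stated assumptions: each pattern is reduced to a mapping from spot identifier to quantified intensity, the distance between patterns is taken as the mean absolute log2 fold change over shared spots (one of many possible choices, not a method given in the text), and all spot names, state names and intensities are hypothetical.

```python
import math


def log2_fold_changes(sample, reference):
    """log2(sample / reference) for every spot present in both patterns."""
    shared = sample.keys() & reference.keys()
    return {spot: math.log2(sample[spot] / reference[spot]) for spot in shared}


def pattern_distance(sample, reference):
    """Mean absolute log2 fold change over the shared spots."""
    changes = log2_fold_changes(sample, reference)
    if not changes:
        raise ValueError("patterns share no spots")
    return sum(abs(v) for v in changes.values()) / len(changes)


def closest_state(sample, references):
    """Name of the reference pattern most similar to the sample."""
    return min(references, key=lambda name: pattern_distance(sample, references[name]))


def changed_spots(sample, reference, min_fold=2.0):
    """Spots whose intensity differs from the reference by at least min_fold."""
    cutoff = math.log2(min_fold)
    changes = log2_fold_changes(sample, reference)
    return sorted(spot for spot, v in changes.items() if abs(v) >= cutoff)


# Hypothetical reference patterns and experimental sample.
REFERENCES = {
    "normal": {"spot1": 100, "spot2": 50, "spot3": 10},
    "acute_phase": {"spot1": 100, "spot2": 200, "spot3": 2},
}
SAMPLE = {"spot1": 110, "spot2": 180, "spot3": 3}

print(closest_state(SAMPLE, REFERENCES))
print(changed_spots(SAMPLE, REFERENCES["normal"]))
```

Note that the state assignment works on spot identifiers alone, which reflects the point made in the text that the proteins behind the spots need not be identified for this use.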

Figure 7. High resolution 2-DE map of proteins isolated from rat serum with or without prior exposure to an inflammation-causing
agent. (A) normal rat serum, (B) acute-phase serum from rats which had previously been exposed to
an inflammation-causing agent. The first dimension of separation is an IPG from pH 4-10, and the second dimension
is a 7.5-17.5%T gradient SDS-PAGE gel. Proteins were visualized by staining with amido black. Further details
of experimental procedures are included in [14, 49].

Proteome analysis as a biological assay has been successfully
used in the field of toxicology, to characterize
disease states or to study differential activation of cells. 
The approach is limited, of course, by the fact that only 
the visible protein spots are included in the assay, and it
is well known that a substantial but far from complete
fraction of cellular proteins are detected if a total cell
lysate is separated by 2-DE. Proteins may not be
detected in 2-DE gels because they are not abundant
enough to be visualized by the detection method used,
because they do not migrate within the boundaries (size,
pI) resolved by the gel, because they are not soluble
under the conditions used, or for other reasons. 

A different way to use proteome analysis as a biological 
assay to define the state of a biological system is to take 
advantage of the wealth of information contained in 
2-DE protein patterns. 2-DE is referred to as two-dimen- 
sional because of the electrophoretic mobility and the 
isoelectric points which define the position of each pro- 
tein in a 2-DE pattern. In addition to the two dimen- 
sions used to generate the protein patterns, a number of 
additional data dimensions are contained in the protein 
patterns. Some of these dimensions such as protein 
expression level, phosphorylation state, subcellular loca- 
tion, association with other proteins, rate of synthesis or 
degradation indicate the activity state of a protein or a 
biological system. Comparative analysis of 2-DE protein 
patterns representing different states is therefore ideally 
suited for the detection, identification and analysis of 
suitable markers. Once again it must be emphasized that 
in this type of experiment only a fraction of the cellular 
proteins is analyzed. Since many regulatory proteins are
of low abundance, this limitation is a concern, particu- 
larly in cases in which regulatory pathways are being 
investigated. 

5 Concluding remarks 

In this report we have addressed three main issues 
related to proteome analysis. First, we have discussed 
the rationale for studying proteomes. Second, we have 
assessed the technical feasibility of analyzing proteomes 
and described current proteome technology, and third, 
we have analyzed the utility of proteome analysis for bio- 
logical research. It is apparent that proteome analysis is 
an essential tool in the analysis of biological systems. 
The multi-level control of protein synthesis and degrada- 
tion in cells means that only the direct analysis of 
mature protein products can reveal their correct identi- 
ties, their relevant state of modification and/or associa- 
tion and their amounts. Recently developed methods 
have enabled the identification of proteins at ever- 
increasing sensitivity levels and at a high level of auto- 
mation of the analytical processes. A number of tech- 
nical challenges, however, remain. While it is currently 
possible to identify essentially any protein spots that can 
be visualized by common staining methods, it is ap- 
parent that without prior enrichment only a relatively 
small and highly selected population of long-lived, 
highly expressed proteins is observed. There are many 
more proteins in a given cell which are not visualized by 
such methods. Frequently it is the low abundance pro-
teins that execute key regulatory functions.

We have outlined the two principal ways proteome anal-
ysis is currently being used to intersect with biological
research projects: the proteome as a database or data 
archive and proteome analysis as a biological assay. Both 
approaches have in common that at present they are con- 
ceptually and technically limited. Current proteome data- 
bases typically are limited to one cell type and one state 
of a cell and therefore do not account for the dynamics 
of biological systems. The use of proteome analysis as a 
biological assay can provide a wealth of information, but 
it is limited to the proteins detected and is therefore not 
truly proteome-wide. These limitations in proteomics are 
to a large extent a reflection of the fact that proteins in 
their fully processed form cannot easily be amplified and 
are therefore difficult to isolate in amounts sufficient for
analysis or experimentation. The fact that to date no
complete proteome has been described further attests to 
these difficulties. With continued rapid progress in pro- 
tein analysis technology, however, we anticipate that the 
goal of complete proteome analysis will eventually 
become attainable.

High-throughput technologies, such as proteomic screening and DNA micro-arrays, produce vast
amounts of data requiring comprehensive analytical methods to decipher the biologically relevant 
results. One approach would be to manually search the biomedical literature; however, this would be 
an arduous task. We developed an automated literature-mining tool, termed MedGene, which 
comprehensively summarizes and estimates the relative strengths of all human gene-disease 
relationships in Medline. Using MedGene, we analyzed a novel micro-array expression dataset 
comparing breast cancer and normal breast tissue in the context of existing knowledge. We found no 
correlation between the strength of the literature association and the magnitude of the difference in 
expression level when considering changes as high as 5-fold; however, a significant correlation was 
observed (r = 0.41; p = 0.05) among genes showing an expression difference of 10-fold or more. 
Interestingly, this only held true for estrogen receptor (ER) positive tumors, not for ER negative tumors. MedGene
identified a set of relatively understudied, yet highly expressed genes in ER negative tumors worthy of
further examination. 

Keywords: bioinformatics • micro-array • text mining • gene-disease association • breast cancer



Introduction 

At its current pace, the accumulation of biomedical literature 
outpaces the ability of most researchers and clinicians to stay 
abreast of their own immediate fields, let alone cover a broader 
range of topics. For example, to follow a single disease, e.g., 
breast cancer, a researcher would have had to scan 130 different 
journals and read 27 papers per day in 1999. 1 This problem is 
accentuated with high-throughput technologies such as DNA 
micro-arrays and proteomics, which require the analysis of 
large datasets involving thousands of genes, many of which are 
unfamiliar to a particular researcher. In any microarray experi- 
ment, thousands of genes may demonstrate statistically sig- 
nificant expression changes, but only a fraction of these may 
be relevant to the study. The ability to interpret these datasets 
would be enhanced if they could be compared to a compre- 
hensive summary of what is known about all genes. Thus, there 
is a need to summarize existing knowledge in a format that 
allows for the rapid analysis of associations between genes and 
diseases or other specific biological concepts. 

One solution to this problem is to compile structured digital 
resources, such as the Breast Cancer Gene Database 1 and the 
Tumor Gene Database. 2 However, as these resources are hand- 
curated, the labor-intensive review process becomes a rate-
limiting step in the growth of the database. As a result, these
databases have a limited scale and the genes are not selected
in a systematic fashion. 

An alternative approach is automated text mining, a method
which involves automated information extraction by searching 
documents for text strings and analyzing their frequency and 
context. This approach has been used successfully in several 
instances for biological applications. In most cases, it has been 
applied to extract information about the relationships or 
interactions that proteins or genes have with one another, in 
the literature or by functional annotation. 3-7 Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al. 
automatically examined the GO (Gene Ontology) annotation 
of genes and their predicted chromosomal locations in order 
to identify genes linked to inherited disorders. 8 

To obtain a more global understanding of disease develop- 
ment, it would be valuable to incorporate information regarding 
all possible gene-disease relationships, including biochemical, 
physiological, pharmacological, epidemiological, as well as 
genetic. This information would enable comprehensive com- 
parisons between large experimental datasets and existing 
knowledge in the literature. This would accomplish two things. 
First, it would serve to validate experiments by demonstrating 
that known responses occur as predicted. Second, it would 
rapidly highlight which genes are corroborated by the literature 
and which genes are novel in a given context. We have utilized 
a computational approach to literature mining to produce a
comprehensive set of gene-disease relationships. In addition,
we have developed a novel approach to assess the strength of 
each association based on the frequency of citation and co- 
citation. We applied this tool to help interpret the data from a 
large micro-array gene expression experiment comparing 
normal and cancerous breast tissue. 



Methods 

MedGene Database. MedGene is a relational database, stor- 
ing disease and gene information from NCBI, text mining re- 
sults, statistical scores, and hyperlinks to the primary lit- 
erature. MedGene has a web-based user interface for users to 
query the database (http://hipseq.med.harvard.edu/MedGene/). 

Text Mining Algorithms. MeSH files were downloaded from 
the MeSH web site at NLM (National Library of Medicine) (http://
www.nlm.nih.gov/mesh/meshhome.html) and human disease
categories were selected. LocusLink files were downloaded from 
the LocusLink web site at NCBI (http://www.ncbi.nih.gov/ 
LocusLink/). Official/preferred gene symbol, official/preferred 
gene name, and gene alternative symbols and names, all 
relevant annotations and URLs for each LocusLink record, were 
collected. Gene search terms were used for literature searching 
and included all qualified gene names, gene symbols, and gene 
family terms. Primary gene keys, predominantly qualified gene 
family terms and gene official/preferred symbols, were used 
to index Medline records. If the official/preferred gene symbols
did not meet the standards to be an index, then qualified gene 
official/preferred names were used. A local copy of Medline 
records (up to July, 2002) was pre-selected. 

A JAVA module examined the MeSH terms and then indexed 
each Medline record with the appropriate disease terms. A 
separate JAVA module was used to examine the titles and 
abstracts for gene search terms and then to index the gene-
related Medline records with the relevant primary gene key(s).

Statistical Methods. For every gene and disease pair, we
counted records that were indexed for both gene and disease
(double positive hits), for disease only (disease single hits), for
gene only (gene single hits), and for neither gene nor disease
(double negative hits) to generate a 2 x 2 contingency table.
On the basis of the contingency table framework, we applied
different statistical methods to estimate the strength of gene-
disease relationships and evaluated the results. These methods
included chi-square analysis, Fisher's exact probabilities, rela-
tive risk of gene, and relative risk of disease 16 (http://
hipseq.med.harvard.edu/MedGene/). In addition, we computed
the "product of frequency", which is the product of the
proportion of disease/gene double hits to disease single hits
and the proportion of disease/gene double hits to gene single
hits. To obtain a normal distribution, we transformed all the
statistical scores using the natural logarithm. We selected the
log of the product of frequency (LPF) to validate MedGene and
to use for the analysis with the micro-array data. Spearman
rank-correlation coefficients were used to assess the linear
relationship between LPF and micro-array fold change in
expression level.

Global Analysis. Diseases with at least 50 related genes were
selected for clustering analysis, and the LPF scores were
normalized by the total score for each disease. Hierarchical
clustering was done with the "Cluster" software and the
clustering result was visualized using "TreeViewer" (http://
rana.lbl.gov/EisenSoftware.htm).

Breast Tissue Micro-Arrays. Eighty-nine breast cancer
samples (79% ER-positive) and 7 normal breast tissue samples
were selected from the Harvard Breast SPORE frozen tissue
repository and were representative of the spectrum of histo-
logical types, grades, and hormone receptor immuno-pheno-
types of breast cancer. Biotinylated cRNA, generated from the
total RNA extracted from the bulk tumor, was hybridized to
Affymetrix U95A oligo-nucleotide micro-arrays. These micro-
arrays consist of 12 400 probes, which represent approximately
9000 genes. Raw expression values were obtained using GENE-
CHIP software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software.

Results

Automated Indexing of Medline Records by Disease and
Gene. To study the gene-disease associations in the literature,
we first compiled complete lists for human diseases and human
genes. To index all Medline records that were relevant to
human diseases, the Medical Subject Heading (MeSH) index
of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists
of a set of terms or subject headings that are arranged in both
an alphabetic and a hierarchical structure. Medline records
are reviewed manually and MeSH terms are added to each with
software assistance. 9,10 Twenty-three human disease category
headings along with all of their child terms (see the Supporting
Information, Supplemental Table 1, or visit http://hipseq.
med.harvard.edu/[...]) were
selected from the 2002 MeSH index creating a list of 4033
human diseases.

No index comparable to the MeSH index exists for genes,
and thus, it was necessary to apply a string search algorithm
for gene names or symbols found in Medline text. A complete
list of genes, gene names, gene symbols, and frequently used
synonyms were collected from the LocusLink database at 
NCBI, 11,12 which contains 53 259 independent records keyed
by an official gene symbol or name (June 18th, 2002). For the
purposes of this study, no distinction was made between genes 
and their gene products. Authors often use the same name for 
both, differentiating the two only by the use of italics, if at all. 
For the intended use of this study, this lack of distinction is 
unlikely to have a large effect and may in fact be beneficial. 

Initial attempts to search the literature using these lists 
revealed several sources of false positives and false negatives 
(Table 1). False positives primarily arose when the searched 
term had other meanings, whereas false negatives arose from 
syntax discrepancies necessitating the development of filters 
to reduce these errors. The syntax issues were readily handled 
by including alternate syntax forms in the search terms. The 
false positive cases, caused by duplicative and unrelated 
meanings for the terms, were more difficult to manage. Where 
possible, case sensitive string mapping reduced inappropriate 
citations. In many cases, however, this was not sufficient and 
the terms had to be eliminated entirely, thereby reducing the 
false positive rate but unavoidably under-representing some 
genes. 
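The case-sensitivity filter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the source states only that Java modules were used); the `CASE_SENSITIVE_SYMBOLS` set and the `mentions` function are hypothetical names introduced here, and the token-boundary rule is an assumption.

```python
import re

# Hypothetical set of gene symbols that are also ordinary English words.
# These are matched with exact case (an assumption made for illustration).
CASE_SENSITIVE_SYMBOLS = {"WAS"}


def mentions(symbol, text):
    """Return True if `symbol` occurs in `text` as a stand-alone token."""
    flags = 0 if symbol in CASE_SENSITIVE_SYMBOLS else re.IGNORECASE
    # Require a non-alphanumeric character (or a string edge) on both sides.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(symbol) + r"(?![A-Za-z0-9])"
    return re.search(pattern, text, flags) is not None


if __name__ == "__main__":
    print(mentions("WAS", "The tumor was resected."))               # False
    print(mentions("WAS", "WAS mutations were found in patients."))  # True
    print(mentions("BRCA1", "Loss of brca1 function was observed."))  # True
```

Matching case-insensitively by default and switching to exact case only for ambiguous symbols mirrors the trade-off in the text: most symbols keep high sensitivity, while a word such as "was" no longer triggers a hit for the Wiskott-Aldrich syndrome gene.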

For the purposes of data tracking, a primary gene key was 
selected to represent all synonyms that correspond to each 
gene. Medline records were indexed with a primary gene key 
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all 
searches except as noted above. No additional weight was
added for multiple occurrences of a term or the co-occurrence
of multiple synonyms for the same gene key.

Table 1. Systematic Sources of False Positives and False Negatives in Unfiltered Data (a)

source of error | error type | example | filter solution
gene symbol/name is not unique | false positive | MAG - myelin associated glycoprotein; MAG - malignancy-associated protein | eliminate this term
gene symbol is unrelated abbreviation | false positive | PA - pallid homologue (mouse), pallidin (also abbrev. for Pennsylvania) | eliminate this term
gene symbol/name has language meaning | false positive | WAS - Wiskott-Aldrich Syndrome (also the word "was") | case-sensitive string search
nonstandard syntax | false negative | BAG-1 instead of BAG1 | add dash term
unofficial gene name/symbol | false negative | P53 instead of TP53 | add all gene nicknames
nonspecified gene name | false negative | estrogen receptor instead of Estrogen receptor 1 | add family stem term

(a) In preliminary studies, Medline was searched for co-occurrence of genes and diseases and the resulting output was evaluated to identify error sources that
were amenable to global filters. Each error source is categorized by the type of error it causes: false positives are suggested relationships that are not real and
false negatives are real relationships that are underrepresented. The filter solutions used are indicated. Note that in some cases, the filter solution itself introduces
error. In general, error rates maximized sensitivity, even at the expense of specificity if needed.

Medline records were searched with all qualified gene 
identifiers, such as the official/preferred gene symbol, the 
official/preferred gene name, all gene nicknames and all syntax 
variants. In situations where there are several members of a 
gene family or splice variants, some authors prefer to use a 
shortened gene family name, e.g., estrogen receptor instead of 
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TGFβ,
ESR1, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene 
names so that it would be clear when linkages were made to 
the gene family versus a specific member in that family. 
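One way to picture the family stem term filter is the following Python sketch. It is an illustration under stated assumptions, not the published algorithm: the suffix list in `_SUFFIX`, the plain substring matching, and the names `family_stem` and `index_abstract` are all hypothetical.

```python
import re
from typing import Dict, Iterable, Optional, Set

# Trailing tokens treated as family-member suffixes (an assumed, partial list).
_SUFFIX = re.compile(r"[\s,]+(?:\d+|alpha|beta|gamma|delta)$", re.IGNORECASE)


def family_stem(gene_name: str) -> Optional[str]:
    """Strip one trailing numeric or Greek-letter suffix, if present."""
    name = gene_name.strip()
    stem = _SUFFIX.sub("", name)
    return stem if stem != name else None


def index_abstract(text: str, gene_names: Iterable[str]) -> Dict[str, Set[str]]:
    """Record specific-gene hits and family-stem hits separately."""
    lowered = text.lower()
    hits: Dict[str, Set[str]] = {"gene": set(), "family": set()}
    for name in gene_names:
        if name.lower() in lowered:
            hits["gene"].add(name)       # the specific member is named
            continue
        stem = family_stem(name)
        if stem is not None and stem.lower() in lowered:
            hits["family"].add(stem)     # only the family is named
    return hits


if __name__ == "__main__":
    names = ["estrogen receptor 1", "estrogen receptor 2"]
    print(index_abstract("Estrogen receptor status was assessed.", names))
```

Keeping the two kinds of hit in separate sets follows the text's requirement that a link to the gene family remain distinguishable from a link to a specific family member.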

To improve performance and accuracy, some pre-selection 
was applied to the records that were scanned. First, review
articles were eliminated to avoid redundant treatment of 
citations. Second, non-English journals were removed because 
the natural language filters were only relevant to English 
publications. Finally, journals unlikely to contain primary data 
about gene-disease relationships were also removed (e.g., Int. 
J. Health Educ, Bedside Nurse, and J. Health Econ). Together,
these filters reduced the 12 198 221 Medline publications (July 
2002) by 37%. 

Ranking the Relative Strengths of Gene-Disease Associa- 
tions. In total, there were 618 708 gene-disease co-citations, 
in which 16% (8297) of all studied genes had been associated 
to a disease and 96% (3875) of all diseases had been associated 
to at least one gene. To rank the relative strengths of gene-
disease relationships, we tested several different statistical
methods and examined the results. With the exception of the 
relative risk estimates, the methods provided similar results 
with respect to the rank order of the gene-disease association 
strengths. However, after comparing the results to other 
databases and after consulting disease experts, the log of the 
product of frequency (LPF) was selected for further analysis 
because it gave the best results overall. 
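The scoring described in the Statistical Methods can be sketched in Python as follows. This is a hedged illustration: the `Contingency` class and function names are hypothetical, the counts are invented, and "single hits" is read literally as records indexed for the disease only or the gene only, which is an assumption about the authors' definition. The chi-square statistic shown is the standard uncorrected 2 x 2 form.

```python
import math
from dataclasses import dataclass


@dataclass
class Contingency:
    both: int          # double positive hits (gene and disease)
    disease_only: int  # disease single hits
    gene_only: int     # gene single hits
    neither: int       # double negative hits


def lpf(t: Contingency) -> float:
    """Natural log of the product of frequency (requires nonzero counts)."""
    product = (t.both / t.disease_only) * (t.both / t.gene_only)
    return math.log(product)


def chi_square(t: Contingency) -> float:
    """Uncorrected chi-square statistic for the 2 x 2 table."""
    a, b, c, d = t.both, t.gene_only, t.disease_only, t.neither
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    # Invented counts for two gene-disease pairs.
    strong = Contingency(both=50, disease_only=950, gene_only=150, neither=98850)
    weak = Contingency(both=5, disease_only=995, gene_only=195, neither=98805)
    for label, table in (("strong", strong), ("weak", weak)):
        print(label, round(lpf(table), 3), round(chi_square(table), 1))
```

Because the product of frequency is a product of two proportions, a pair scores highly only when the co-citations are a large share of both the disease literature and the gene literature.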

Figure 1. Estimation of the false negative rate by comparison
with hand-curated databases. The breast cancer-related genes
identified by MedGene were compared with those listed in
several other databases including the Tumor Gene Database
(TGD), 2 the Breast Cancer Gene Database (BCG), 1 GeneCards
(GC) 17 and Swissprot. 18 Genes were considered false negatives
if they were represented in at least one of these other databases
and not in MedGene and their link to breast cancer was sup-
ported by at least one literature reference. All literature references
were verified by manual review to confirm their validity. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated as
genes missed by MedGene (26)/total number of nonoverlapping
genes in other databases (285).

Validation of MedGene. In developing this tool, it was
important to minimize the number of missed genes (false
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because
there were several public databases that link genes to breast
cancer. We compared the list of breast cancer-related genes 
from MedGene to these databases, illustrated in Figure 1. 
Among the 285 distinct breast cancer-related genes that were 
supported by at least one literature citation in these hand- 
curated databases, 26 were absent from MedGene, suggesting 
a false negative rate of approximately 9%. To determine why 
these were missed, all literature references for these genes (80
papers) were reviewed manually (see the Supporting Informa-
tion, Supplemental Table 2, or visit http://hipseq.med.
harvard.edu/MedGene/publication/s_Table 2.html). Among
these papers, most false negatives were caused by nonstandard 
gene terms or gene terms eliminated by our specificity filters. 
Few genes were missed because they were only mentioned in 
review papers (0.4%) or they appeared only in the body of the 
manuscript but not the abstract or title (1.1%). Of note, 
MedGene identified approximately 2000 additional breast 
cancer-related genes not listed in any other database. 

To assess the false positive error rate, two complementary 
approaches were used: a detailed analysis of one disease and 
a global examination of 1000 diseases. The detailed approach 
examined the false positive error rate and its sources, whereas 
the global approach tested whether the overall results made 
biomedical sense. 

Using the LPF, 1467 genes related to prostate cancer were 
assembled in rank order. We then retrieved approximately 300 
Medline records each for the highest ranked 100 and the lowest 
ranked 200 genes and manually reviewed the titles and 
abstracts to determine the verity of the association. Nearly 80% 
of the highest ranked 100 genes fell into one of the five 
categories that reflect meaningful gene-disease relationships 
(see the Supporting Information, Supplemental Table 3, or visit 
http://hipseq.med.harvard.edu/MedGene/publication/ 
s_Table 3.html). Among the lowest ranked 200 genes, ap-
proximately 70% reflected true relationships. Of the 600 records
reviewed, there were only two in which the association between
the gene and the disease was described as negative. Both were 
genes with very low scores. In both cases, the authors did not 
argue the absence of any relationship, but rather that a 
particular feature of the gene or protein was not shown to be 
related to human prostate cancer. 13,14 

The coincidence of some gene symbols with medical ab-
breviations, chemical abbreviations and biological abbrevia-
tions resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit http://hipseq.med.
harvard.edu/MedGene/publication/s_Table 4.html), em-
phasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive 
rate more than doubled, and the false negative rate rose 
dramatically (data not shown). For example, among the papers 
about breast cancer, there were only 12 Medline records that 
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter.

To further validate these results, a global analysis of the gene- 
disease relationships described by MedGene was performed. 
For this experiment, it was reasoned that the more closely 
related the diseases are to one another, the more they will be 
related to the same gene sets. Thus, if the relationships defined 
by MedGene accurately reflected the literature, then an unsu- 
pervised hierarchical clustering of the gene data should group 
diseases in a manner consistent with common medical think- 
ing. Conversely, if the clustered diseases do not make sense 
biologically or medically, it may reflect excessive false positives, 
false negatives, or inappropriate scoring of the data. 

To execute this experiment, the gene sets and the corre- 
sponding LPF values for 1000 randomly selected diseases (each 
with at least 50 gene relationships) were used as a dataset for 
clustering the diseases. A review of the results showed that the 
resulting disease clusters were indeed logical based upon 
common medical knowledge (see the Supporting Information,
Supplemental Figure 1, or visit http://hipseq.med.harvard.edu/
MedGene/publication/s_Figure 1.html). For example, in one
such cluster shown in Figure 2, diabetes and its complications 
grouped together and were also closely linked to diseases 
associated with starvation states. 
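The reasoning behind the global validation, that related diseases should share gene sets and therefore cluster together, can be sketched in pure Python. The source used the "Cluster" software; the average-linkage agglomerative procedure, the cosine distance, the function names, and the toy disease profiles below (with invented weights standing in for normalized scores) are assumptions made only for illustration.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity between two sparse gene->score profiles."""
    dot = sum(u.get(g, 0.0) * v.get(g, 0.0) for g in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return 1.0 - dot / (norm_u * norm_v)


def cluster(profiles, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [[name] for name in profiles]

    def avg_dist(a, b):
        total = sum(cosine_distance(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters (i < j, so deleting j is safe).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Invented weights; not MedGene output.
    toy = {
        "Diabetes Mellitus": {"INS": 0.6, "LEP": 0.3, "GCK": 0.1},
        "Hyperglycemia": {"INS": 0.7, "GCK": 0.3},
        "Breast Neoplasms": {"BRCA1": 0.5, "ERBB2": 0.3, "TP53": 0.2},
        "Prostatic Neoplasms": {"TP53": 0.5, "ERBB2": 0.2, "AR": 0.3},
    }
    print(cluster(toy, 2))
```

In this toy input the two blood-sugar conditions share INS and GCK and the two cancers share ERBB2 and TP53, so the procedure groups each pair, which is the kind of medically sensible outcome the validation looks for.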

The number of genes associated with a given disease can 
be estimated by adjusting the MedGene number up by the false 
negative rate (~9%) and down by the false positive rate (~26% 
on average). Using this, the average disease has 103.7 ± 45.3 
(mean ± s.d.) genes associated with it, although the range is 
quite broad with 2359 genes related to breast cancer, 2122 
genes related to lung cancer and no genes related to a number 
of diseases. 
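The adjustment described above can be written out as a short Python sketch. The text does not give the exact formula, so the one used here is an assumption: the false positive rate is taken as the fraction of listed genes that are not real associations, and the false negative rate as the fraction of real associations that are missed. The function names and the raw count of 1000 are hypothetical.

```python
def false_negative_rate(missed, total_reference):
    """Fraction of reference genes absent from the mined list."""
    return missed / total_reference


def adjusted_gene_count(listed, fp_rate, fn_rate):
    """Estimate the true number of associated genes from a raw list size."""
    true_listed = listed * (1.0 - fp_rate)   # remove miscalled genes
    return true_listed / (1.0 - fn_rate)     # add back missed genes


if __name__ == "__main__":
    fn = false_negative_rate(26, 285)        # figures from the breast cancer comparison
    print(round(fn, 3))
    print(round(adjusted_gene_count(1000, fp_rate=0.26, fn_rate=0.09), 1))
```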

Applying MedGene to the Analysis of Large Datasets. Access 
to a comprehensive summary of the genes linked to human 
diseases provided an opportunity to analyze data obtained from 
a high-throughput experiment. We compared the MedGene 
breast cancer gene list to a gene expression data set generated 
from a micro-array analysis comparing breast cancer and 
normal breast tissue samples. Micro-array analysis identified 
2286 genes that had greater than a 1-fold difference in mean 
expression level between breast cancer samples and normal 
breast samples. Using MedGene, we sorted the 2286 genes into 
four classes: 555 genes directly linked to breast cancer in the 
literature by gene term search (first-degree association by gene 
name); 328 genes directly linked by family term search (first-
degree association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with 
breast cancer. (See the Supporting Information, Supplemental 
Figure 2, or visit http://hipseq.med.harvard.edu/MedGene/
publication/s_Figure 2.html.) Among the 505 previously un- 
related genes, 467 were either newly identified genes or genes 
that had not previously been associated with any disease.
Among the remaining 38 genes, 9 had been related to other
cancers, specifically esophageal, colon, uterine, skin, and cervix. 
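The four-way sorting of micro-array genes can be expressed as a small Python sketch. The function name, the class labels, and the `co_cited` mapping (gene to the set of genes it is co-cited with) are hypothetical; in particular, reading "linked only through other breast cancer genes" as co-citation with a first-degree gene is an assumption, and the example sets are invented.

```python
def classify(gene, by_name, by_family, co_cited):
    """Assign a gene to one of the four literature-association classes."""
    if gene in by_name:
        return "first-degree (gene name)"
    if gene in by_family:
        return "first-degree (family term)"
    # Second degree: no direct link, but co-cited with a directly linked gene.
    if co_cited.get(gene, set()) & (by_name | by_family):
        return "second-degree"
    return "not previously associated"


if __name__ == "__main__":
    by_name = {"BRCA1", "ERBB2"}            # invented example sets
    by_family = {"ESR1"}
    co_cited = {"GENE_X": {"BRCA1"}, "GENE_Y": {"GENE_Z"}}
    for g in ("BRCA1", "ESR1", "GENE_X", "GENE_Y"):
        print(g, "->", classify(g, by_name, by_family, co_cited))
```

Checking the direct classes before the second-degree class keeps the four groups mutually exclusive, as the counts in the text require.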

To determine whether the genes highlighted by the micro- 
array analysis were more likely to have been previously linked 
to breast cancer in the literature, we created a two-dimensional 
plot of the fold change of expression level between breast 
cancer and normal tissue versus the literature score (LPF) 
(Figure 3A). There was a broad spread of expression changes 
among the genes directly linked to breast cancer ranging from 
less than 1-fold change (68%) to over 40-fold (0.3%). Notably, 
the majority of genes with greater than 10-fold expression 
changes were linked to breast cancer by first-degree associa- 
tion. 

Among all 754 genes directly linked to breast cancer in the 
literature, there was no correlation between LPF and micro- 
array fold change (r = 0.018, p-value = 0.62). However, when 
we stratified the analysis based on the magnitude of the fold 
change, we observed an increasing trend in correlation (Figure 
3B) suggesting that genes with a more substantial change in 
expression level were more likely to have a stronger association 
in the literature. For genes that had 10-fold change or more in 
expression level, the correlation increased to 0.41 (p-value = 
0.05). 
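The stratified correlation analysis can be sketched in pure Python as below. Spearman's coefficient is computed as the Pearson correlation of average ranks, which is the standard definition; the function names, the minimum subset size of three, and the (LPF, fold change) pairs are assumptions and invented data, not values from the study.

```python
import math


def ranks(values):
    """Average ranks (1-based), with ties sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def stratified(pairs, thresholds):
    """Spearman correlation among genes at or above each fold-change cutoff."""
    result = {}
    for cutoff in thresholds:
        subset = [(score, fold) for score, fold in pairs if fold >= cutoff]
        if len(subset) >= 3:  # too few points give no meaningful coefficient
            result[cutoff] = spearman([s for s, _ in subset], [f for _, f in subset])
    return result


if __name__ == "__main__":
    # Invented (LPF, fold change) pairs.
    data = [(-9.1, 1.2), (-4.0, 1.5), (-8.5, 2.0), (-7.0, 6.0),
            (-5.5, 12.0), (-4.8, 15.0), (-3.9, 25.0), (-3.1, 41.0)]
    print(stratified(data, [1, 5, 10]))
```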

When we evaluated the micro-array data separately for ER 
positive and ER negative tumors, the trend in correlation 
between fold change and literature score was highly dependent 
on estrogen receptor status. Interestingly, there was a similar 
trend in correlation for ER positive tumors, but no trend in 
correlation for ER negative tumors. 



408 Journal of Proteome Research • Vol. 2, No. 4, 2003 



Analysis of Data Using Advanced literature Mining 



research articles 




[Figure 2 image not reproduced. Disease labels recovered from the cluster in panel B, in order of appearance: Coxsackievirus Infections; Obesity in Diabetes; Diabetic Ketoacidosis; Glucose Intolerance; Diabetes Mellitus, Non-Insulin-Dependent; Diabetes Mellitus, Insulin-Dependent; Pregnancy in Diabetics; Diabetic Retinopathy; Diabetic Angiopathies; Diabetic Neuropathies; Glycosuria; Hyperinsulinemia; Hypoglycemia; Hyperglycemia; Diabetes Mellitus, Experimental; Diabetes Mellitus; Diabetes, Gestational; Starvation; Jaundice, Neonatal; Brain Edema; Pulmonary Edema; Nutrition Disorders; Kwashiorkor; Critical Illness; Burns; Diabetic Nephropathies.]

Figure 2. Global validation by clustering analysis. 2(A). The gene sets and the corresponding LPF values for 1000 diseases, each with
at least 50 gene relationships, were used in an unsupervised clustering of the diseases based on the gene patterns associated with 
them. A sample of the data is shown here. 2(B). One of the resulting clusters is shown that corresponds to blood sugar states. Diabetes 
terms (above the line) and starvation states terms (under the line) clustered together. Within these groups, there is also clustering of 
diabetic small vessel complications, altered serum chemistries, nutritional disorders, etc. (Supplemental Figure 1: http://hipseq.med.
harvard.edu/MedGene/publication/s_Figure 1.html).



Finally, to validate our findings, we computed similar cor- 
relations between the breast cancer expression data and 
LPF scores generated by MedGene for hypertension, a
disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hyperten-
sion.

Figure 3. Relationship between literature score and functional data for breast cancer. 3A. The data from an expression analysis of
samples for breast tumors and normal breast tissue were analyzed to indicate the fold difference of expression level between breast 
tumor and normal sample (cutoff > 3-fold change). The fold changes were plotted against the literature score for the same gene set. 
Green dots represent first-degree association by gene search, blue dots represent first-degree association by family search and red 
dots represent no-association. Some well-studied genes, such as BRCA2 (pink circle), are not reflected by a substantial difference in 
expression level. Furthermore, the majority of genes that have no association with breast cancer in the literature had less than 10-fold 
expression changes (shaded area). 3B. The Spearman rank-correlation coefficients between literature score (LPF) and the fold change 
of expression level between tumor and normal breast samples (y-axis) in relation to the amount of fold change of expression level 
(x-axis). Gene rank lists were generated for breast cancer (blue) and hypertension (pink). Correlations were also computed between 
the breast cancer gene LPF scores and fold change expression data among estrogen receptor positive tumors only (light blue) and 
estrogen receptor negative tumors only (purple). 
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breast neoplasms 


estrogen receptor, PGR, ERBB2, BRCA1, BRCA2, EGFR, CYP19, TFF1, PSEN2,
TP53, CES3, CEACAM5, ERBB3, cyclin, COX5A, cathepsin, ERBB4, TRAM,
CCND1, EGF, MUC1, transducer of ERBB2, BCL2, mucin, FGF3

hypertension

REN, DBP, LEP, AGT, INS, kallikrein, ACE, endothelin, S100A6, BDK,
DIANPH, SARI, PIH, CD59, ALB, CYP11B2, MAT2B, angiotensin receptor,
AGTR2, NPPA, LVM, DBH, NPY, POMC, neuropeptide

rheumatoid arthritis

RA, TNFRSF10A, CRP, AS, ESR1, HLA-DRB1, DR1, interleukin, TNF, IL6,
collagen, IL1A, ACR, TNFRSF12, IL2, CHI3L1, IL8, interleukin 1,
matrix metalloproteinase, interferon, CD68, IL4, IL17, MMP3, SIL

bipolar disorder

ERDA1, SNAP29, PFKL, DRD2, TRH, IMPA2, HTR3A, DRD3, REM, KCNN3, DRD4,
HTR2C, RELN, DBH, MAOA, COMT, HTR2A, SYNJ1, INPP1, NEDD4L, FRA13C,
insulin-like, BAIAP3, ATP1B3, DRD5

atherosclerosis

apolipoprotein, APOE, LDLR, ELN, ARG1, APOB, APOA1, MSR1, LPL, PON1,
plasminogen activator inhibitor, PLG, vascular cell adhesion molecule,
ATOH1, VWF, INS, ARG2, ABCA1, OLR1, collagen, MCP, lipoprotein, APOA2,
intercellular adhesion molecule, RAB27A

* MedGene results for the top 25 genes associated with breast neoplasms, hypertension, rheumatoid arthritis, bipolar disorder, and atherosclerosis, respectively,
ranked by LPF scores. Hyperlinks to all the papers co-citing the gene and the disease are available at the MedGene website (http://hipseq.med.harvard.edu/MedGene/).



Discussion 

The Human Genome Project heralded a new era in biological 
research where the emphasis on understanding specific path- 
ways has expanded to global studies of genomic organization 
and biological systems. High-throughput technologies can 
provide novel insight into comprehensive biological function 
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database 
derived by mining the information in Medline, was created to 
address this need. MedGene users can query for a rank-ordered 
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers 
supporting each association and to other relevant databases. 

MedGene is an innovative extension of previous text mining 
approaches. Perez-Iratxeta et al. used GO annotation and
chromosomal location to predict genes that may contribute
to inherited disorders. 8 MedGene takes a broader view
ships. Furthermore, MedGene utilizes co-citation to indicate a 
relationship rather than GO annotation, which is limited to the 
subset of genes that have GO annotation. Our approach is 
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a 
hierarchy of gene-gene relationships. 6 

A unique aspect of this tool is the ability to assess the relative 
strengths of gene-disease relationships based on the frequency 
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, often referred
to as publication bias, 15 and is supported by our observations
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, it is strong evidence to support a true relationship.
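The idea of scoring a gene-disease pair from both co-citation and single-citation counts can be sketched as follows. This is only an illustration: the formula (the log of the product of the two co-citation proportions) and the function name `cocitation_score` are assumptions made for the example, and the counts are invented. This passage does not state how the LPF score itself is computed.

```python
from math import log


def cocitation_score(n_both, n_gene, n_disease):
    """Illustrative strength score for one gene-disease pair.

    n_both    -- papers citing both the gene and the disease
    n_gene    -- papers citing the gene at all
    n_disease -- papers citing the disease at all

    Assumed formula (not taken from the text): the natural log of
    (n_both / n_gene) * (n_both / n_disease). A pair that is never
    co-cited scores negative infinity.
    """
    if n_gene <= 0 or n_disease <= 0:
        raise ValueError("single-citation counts must be positive")
    if n_both < 0 or n_both > n_gene or n_both > n_disease:
        raise ValueError("co-citation count must lie between 0 and each single count")
    if n_both == 0:
        return float("-inf")
    return log((n_both / n_gene) * (n_both / n_disease))


# Invented counts: the same gene and disease, co-cited often versus rarely.
often = cocitation_score(50, 100, 1000)
rarely = cocitation_score(5, 100, 1000)
```

Using both proportions means a heavily studied gene does not rank highly merely because it appears in many papers; it must also account for a meaningful share of the disease literature.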

Another important feature of MedGene is the implementa- 
tion of software filters that substantially reduced the error rate. 
We estimate that less than 10% of all associations were missed 
and at least 70% of even the weakest associations were real. 
For this study, all of the filters that we applied were general 
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating 
gene names that correspond to common English words, etc. 
The majority of the remaining search term ambiguities were 
idiosyncratic and difficult to identify systematically without 
causing a significant rise in false negatives. Alternative ap- 
proaches, such as the examination of the nearest neighbor 
terms, need to be considered to further reduce the false positive 
rate. 

It is not uncommon to see expression changes in micro- 
array experiments as small as 2-fold reported in the literature. 
Even when these expression changes are statistically significant, 
it is not always clear if they are biologically meaningful. When 
comparing expression levels of disease to normal tissue, one 
expects an enrichment of known disease-related genes to 
appear in the altered expression group. MedGene provided a 
unique opportunity to test this notion in the context of existing 
knowledge on a novel breast cancer micro-array dataset. For 
genes displaying a 5-fold change or less in tumors compared 
to normal, there was no evidence of a correlation between 
altered gene expression and a known role in the disease. This 
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Table 3. Genes with Large Expression Changes in ER- but 
Not in ER+ Breast Tumors 



gene symbol     fold change (ER+)    fold change (ER-)

KRTHB1          1.0                  610.8
BRS3            1.2                  89.4
DKK1            1.2                  69.8
ZIC1            1.9                  59.6
TLR1            1.0                  38.5
KIAA0680        2.6                  33.2
CDKN3           1.0                  30.6
EBI2            4.0                  27.9
GZMB            3.8                  21.9
STK18           4.7                  18.6
GPR49           1.0                  14.6
MYO10           1.6                  14.4
LAD1            -1.0                 13.5
POLE2           4.2                  13.0
HMG4            4.4                  12.9
BCL2L11         -1.2                 12.3
LRP8            2.9                  12.2
CCNB2           1.0                  11.8
CCNE2           4.0                  11.6
FGB             -4.3                 11.1
KNSL6           2.9                  10.9
H1F5            3.0                  10.2
[illegible]     [illegible]          10.2
VAP1            [illegible]          10.0
[illegible]     [illegible]          -10.4
TCPA9           [illegible]          -10.8
[illegible]     [illegible]          -11.4
CHI 17A1        -4.1                 -15.7
POPS            1.1                  -16.2
BPAG1           4.6                  -22.3
PDZK1           -1.1                 -36.8
VEGFC           -2.8                 -51.5
MUC6            -1.4                 -64.9
SERPINA5        -1.0                 -83.1
MEIS1           -1.6                 -85.9
CA12            2.4                  -150.3


Table 3. MedGene identified a set of relatively understudied, yet highly
expressed genes in ER negative, but not ER positive breast tumors. All of
these genes have either never been co-cited with breast cancer or have a
weak association except those marked with an *.



reflects the many genes whose role in breast cancer may not 
involve large changes in expression in sporadic tumors (e.g., 
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among 
genes with a 10-fold change or more in expression level, there 
was a strong and significant correlation between expression 
level and a published role in the disease, providing the first 
global validation of the micro-array approach to identifying 
disease-specific genes. 
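The comparison described above, a rank correlation between literature score and expression change that is recomputed at increasing fold-change cutoffs, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper names and the `toy` data are invented for the example, the magnitude of the fold change is used so that up- and down-regulated genes are treated alike, and none of it reproduces the authors' actual pipeline or data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def correlation_by_cutoff(genes, cutoffs):
    """Map each cutoff to the correlation among genes with |fold change| >= cutoff.

    genes is a list of (literature_score, fold_change) pairs. Cutoffs that
    leave fewer than three genes map to None.
    """
    result = {}
    for cutoff in cutoffs:
        kept = [(s, abs(f)) for s, f in genes if abs(f) >= cutoff]
        if len(kept) < 3:
            result[cutoff] = None
        else:
            result[cutoff] = spearman([s for s, _ in kept], [f for _, f in kept])
    return result


# Invented (literature score, fold change) pairs, for illustration only.
toy = [(0.1, 2.0), (0.5, -3.0), (0.2, 12.0), (0.6, -20.0), (0.9, 40.0)]
by_cutoff = correlation_by_cutoff(toy, [3, 10, 50])
```

Running the same sweep with scores for an unrelated disease, as the authors did with hypertension, gives a control curve against which any rising trend can be judged.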

The results derived from MedGene have two implications. 
First, a careful hunt for corroborating evidence of a role in 
breast cancer should precede any further study of genes with 
less than 5-fold expression level changes. Second, any genes 
with 10-fold changes or more are likely to be related to breast 
cancer and warrant attention. It is likely that this threshold will 
change depending on the disease as well as the experiment. 

Interestingly, the observed correlation was only found among 
ER-positive tumors, not ER-negative. This may reflect a bias 
in the literature to study the more prevalent type of tumor in 
the population. Furthermore, this emphasizes that caution 
must be taken when interpreting experiments that may contain 
subpopulations that behave very differently. The MedGene 
approach identified a set of relatively understudied, yet highly 
expressed genes in ER-negative tumors that are worthy of 
further examination (Table 3). 
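A selection of this kind, genes that change strongly in ER-negative tumors but not in ER-positive ones, can be expressed as a simple filter. The sketch below is illustrative only: the cutoffs (magnitude of at least 10 in ER-negative, below 5 in ER-positive) are assumptions chosen to match the thresholds discussed above, not criteria stated by the authors, and `GENE_X` and `GENE_Y` are hypothetical rows added to show what the filter rejects.

```python
def er_negative_specific(rows, large=10.0, small=5.0):
    """Return gene symbols with a large ER- change and a small ER+ change.

    rows is a list of (gene_symbol, fold_change_er_pos, fold_change_er_neg).
    The sign of a fold change is ignored, so both strongly induced and
    strongly repressed genes are kept.
    """
    return [
        gene
        for gene, er_pos, er_neg in rows
        if abs(er_neg) >= large and abs(er_pos) < small
    ]


rows = [
    ("KRTHB1", 1.0, 610.8),   # values as listed in Table 3
    ("FGB", -4.3, 11.1),      # values as listed in Table 3
    ("CA12", 2.4, -150.3),    # values as listed in Table 3
    ("GENE_X", 12.0, 15.0),   # hypothetical: large change in both groups
    ("GENE_Y", 1.1, 2.0),     # hypothetical: small change in both groups
]
selected = er_negative_specific(rows)
```

Intersecting such a list with genes that have weak or no literature association would then give the understudied candidates.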



In conclusion, we have developed an automated method of 
summarizing and organizing the vast biomedical literature. To 
our knowledge, the resulting database is the most comprehen- 
sive and accurate of its kind. By generating a score that reflects 
the strength of the association, it provides an important tool 
for the rapid and flexible analysis of large datasets from various 
high-throughput screening experiments. Furthermore, it can 
be used for selecting subsets of genes for functional studies, 
for building disease-specific arrays, for looking at genes com- 
mon to multiple diseases and various other high-throughput 
applications. In the future, it will be possible to enhance the 
utility of the MedGene database by building links between 
genes and other MeSH terms as well as other biological 
processes and concepts, such as cell division and responses to 
small molecules. 
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Each year, over 182,000 women in the United States are 
diagnosed with breast cancer, and approximately 45,000 die 
of the disease. 1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridiza-
tion, PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in



recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu 
status may be particularly important to establish in women with 
small (≤1 cm) tumor size.

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin- 
based therapy, while those with normal HER-2/neu levels did
not. The study therefore identified a subset of women who,
because they did not benefit from more aggressive treatment,
did not need to be exposed to the associated side effects. In
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER-
2/neu gene amplification by FISH has also been shown to be
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (trastuzumab) monoclonal
antibody therapy, however, is based upon demonstration
of HER-2/neu protein overexpression using HercepTest™.
Studies using Herceptin® in patients with metastatic breast
cancer show an increase in time to disease progression, an
increased response rate to chemotherapeutic agents, and a small
increase in overall survival rate. The FISH assays have not yet
been approved for this purpose, and studies looking at response
to Herceptin® in patients with or without gene amplification,
as determined by FISH, are in progress.

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest®) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 
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CPT code information 

HER-2/neu via IHC 

88342 (including interpretive report) 

HER-2/neu via FISH 

88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization, analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation and report

Procedural Information 

Immunohistochemistry is performed using the FDA-approved 
DAKO antibody kit, HercepTest®. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs 
a ready-to-use dextran-based visualization reagent. This reagent
consists of both secondary goat anti-rabbit antibody
molecules and horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need 
for sequential application of link antibody and peroxidase 
conjugated antibody. Enzymatic conversion of the subse- 
quently added chromogen results in formation of visible 
reaction product at the antigen site. The specimen is then
counterstained, and a pathologist interprets the results by
light microscopy.

FISH analysis at SHMC/PAML is performed using the 
FDA-approved PathVysion™ HER-2/neu DNA probe kit, pro- 
duced by Vysis, Inc. Formalin fixed, paraffin-embedded breast 
tissue is processed using routine histological methods, and then 
slides are treated to allow hybridization of DNA probes to the 
nuclei present in the tissue section. The PathVysion™ kit contains
two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (spectrum green). Enumeration
of the probes allows a ratio of the number of copies
of HER-2/neu to the number of copies of chromosome 17 to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or to an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of these data will be
performed and reported by the Vysis-certified Cytogenetics
laboratory at SHMC.
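The ratio cutoffs quoted above can be written out as a small worked example. This is a sketch, not laboratory software: it assumes the ratio is taken as HER-2/neu signals over chromosome 17 centromere signals (the direction under which a value of 2 or more indicates amplification), the function names and signal counts are invented, and the label for values between 2.0 and 2.1 is an assumption because the text does not say how that interval is reported.

```python
def her2_to_cep17_ratio(her2_signals, cep17_signals):
    """Ratio of total HER-2/neu signals to total chromosome 17 signals
    counted over the same set of nuclei."""
    if cep17_signals <= 0:
        raise ValueError("chromosome 17 signal count must be positive")
    if her2_signals < 0:
        raise ValueError("HER-2/neu signal count cannot be negative")
    return her2_signals / cep17_signals


def interpret_ratio(ratio):
    """Apply the cutoffs given in the text: below 2.0 is normal/negative,
    2.1 and above is amplified. The interval in between is not addressed
    in the text, so it is flagged separately here (an assumption)."""
    if ratio < 2.0:
        return "normal/negative"
    if ratio >= 2.1:
        return "amplified"
    return "not specified"


# Invented counts from a hypothetical set of scored nuclei.
normal_case = interpret_ratio(her2_to_cep17_ratio(42, 40))      # ratio 1.05
amplified_case = interpret_ratio(her2_to_cep17_ratio(120, 40))  # ratio 3.0
```

Dividing by the chromosome 17 count is what separates true gene amplification from extra copies of the whole chromosome: a cell with four copies of chromosome 17 and four HER-2/neu signals still has a ratio of 1.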
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